Abstract

The aim of this paper is to design a theoretical framework that allows
us to perform the computation of regular expression derivatives through
a space of generic structures. Thanks to this formalism, the main properties
of regular expression derivation, such as the finiteness of the set of
derivatives, need only be stated and proved once, at the top level.
Moreover, it is shown how to construct an alternating automaton associated
with the derivation of a regular expression in this general framework.
Finally, Brzozowski's derivation and Antimirov's derivation turn out to be
particular cases of this general scheme and it is shown how to construct
a DFA, an NFA and an AFA for both of these derivations.

1 Introduction 

The (left) quotient of a language $L$ over an alphabet $\Sigma$ with respect to a word
$w$ in $\Sigma^*$ is the language obtained by stripping the leading $w$ from the words in
$L$ that are prefixed by $w$. The quotient operation plays a fundamental role in
language theory and is especially involved in two main issues. First, checking
whether a word $w$ belongs to a language $L$ turns out to be equivalent to checking
whether the empty word belongs to the quotient of $L$ w.r.t. $w$. Secondly, it was
proved by Myhill [17] and Nerode [18] that a language is regular if and only
if the set of its quotients w.r.t. all the words in $\Sigma^*$ is finite. In the case of a
regular language $L$, the different quotients are the states of the so-called quotient
automaton of $L$ that is isomorphic to its minimal deterministic automaton.

Since the equality of two languages amounts to the isomorphism of their 
minimal deterministic automata, the construction of the quotient automaton 
via the computation of the left quotients is intractable. The seminal work of 
Brzozowski [3] that introduced the notion of a word derivative of a regular 
expression and the construction of the associated DFA, gave rise to a long series
of studies (see for example [3], [1], [7], among others) that are all based on the simulation of
the computation of a language quotient by that of an expression derivative.
In all these research works, the first rule is that if the expression E denotes the
language L, then the derivative of E w.r.t. w, for any w, denotes the quotient of
L w.r.t. the word w. Thus, checking whether a word w belongs to the language
denoted by the expression E is equivalent to checking whether the empty word belongs
to the language denoted by the derivative of E w.r.t. w. The second rule is
that, provided the set D of all the derivatives of E is finite, a finite automaton
recognizing the language denoted by E can be constructed, admitting D as its
set of states.

Let us notice that Brzozowski derivatives [4] handle unrestricted regular
expressions and provide a deterministic automaton; Antimirov derivatives [1] only
address simple regular expressions and provide both a deterministic automaton
and a non-deterministic one; Antimirov derivatives have been recently extended
to regular expressions [7] and this extension provides a deterministic automaton,
a non-deterministic one and, as shown in this paper, an alternating automaton.
Berry and Sethi continuations [3] are based on the linearization of the (simple)
input expression and allow the construction of its Glushkov (non-deterministic)
automaton. Champarnaud and Ziadi c-continuations [9] and Ilie and Yu
derivatives allow both the construction of the Glushkov automaton and of the
Antimirov non-deterministic automaton. Let us mention that derivation has
been extended to expressions with multiplicity [TSl [S] . 

As mentioned by Antimirov [1], derivatives of regular expressions have proved 
to be a productive concept to investigate theoretical topics such as the algebra 
of regular expressions [11] or of X-regular expressions [H] , the systems of lan- 
guage equations [6], the equivalence of simple regular expressions |12| or of 
regular expressions P]. More recently, Brzozowski introduced a new approach 
for finding upper bounds for the state complexity of regular languages, based 
on the counting of their quotients (or of their derivatives) . 

Moreover, derivatives provide a useful tool to implement regular matching
algorithms: Brzozowski's DFA and Antimirov's NFA turn out to be competitive
matching automata [20], compared for instance with Thompson's
$\varepsilon$-automaton [21]. The derivative-based techniques are well-suited to functional
languages, which are characterized by a good support for symbolic term
manipulation. As an example, two derivative-based scanner generators have been
recently developed, one for PLT Scheme and one for Standard ML, as reported
in [19]. Similarly, Brzozowski's derivatives are used in the implementation of the
XML schema language RELAX NG [10]. Finally, let us notice that derivatives
can be extended to context-free grammars, seen as recursive regular expressions,
yielding a system for parsing context-free grammars [16].

The aim of this paper is to design a general framework where the com- 
putation of the set of derivatives of a regular expression, called derivation, is 
performed over a space of generic structures. Of course Brzozowski's derivation 
and Antimirov's one appear as particular cases of this general scheme. A first 
benefit of this formalism is that the properties inherent to the mechanism of 
derivation, such as the equality between the language denoted by a derivative
and the corresponding quotient, the finiteness of the set of derivatives and the
way for constructing the associated automata, need only be stated and proved
once, at the top level. A second benefit is that the general framework allows
us to design the construction of an AFA from the set of derivatives. As
a consequence, we show how to construct a DFA, an NFA and an AFA for any
finite derivation, including both Brzozowski's one and Antimirov's one.

The next section is a preliminary section; it gathers classical notions
concerning regular languages, regular expressions and finite automata, as well as
boolean formulas and alternating automata. The notion of a regular expression
derivation via a support is defined in Section 3 and the properties of the
corresponding derivatives are investigated. Section 4 is devoted to the construction of
the alternating automaton associated with the derivation of a regular expression
via a support. This construction is illustrated in Section 5 where the support
is based on the set of clausal forms over the alphabet of the regular expressions.



2 Preliminaries 

Let $\mathbb{B} = \{0,1\}$. A boolean formula $\phi$ over a set $X$ is inductively defined
by $\phi = x$ where $x \in X$, or $\phi = f_{\mathbb{B}}(\phi_1, \ldots, \phi_k)$, where $f_{\mathbb{B}}$ is the operator
associated to the $k$-ary boolean function $f$ from $\mathbb{B}^k$ to $\mathbb{B}$ and $\phi_1, \ldots, \phi_k$ are
boolean formulas over $X$. The set of the boolean formulas over $X$ is denoted
by BoolForm($X$). Let $v$ be a function from $X$ to $\mathbb{B}$. The evaluation of $\phi$ with
respect to $v$ is the boolean $\mathrm{eval}_v(\phi)$ inductively defined by: $\mathrm{eval}_v(x) = v(x)$,
$\mathrm{eval}_v(f_{\mathbb{B}}(\phi_1, \ldots, \phi_k)) = f(\mathrm{eval}_v(\phi_1), \ldots, \mathrm{eval}_v(\phi_k))$. The set $\mathrm{Atom}(\phi)$ is the
subset of $X$ inductively defined by $\mathrm{Atom}(x) = \{x\}$ and
$\mathrm{Atom}(f_{\mathbb{B}}(\phi_1, \ldots, \phi_k)) = \bigcup_{1 \le j \le k} \mathrm{Atom}(\phi_j)$.
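The two inductive definitions above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: atoms are represented as strings, an operator node as a pair `(f, children)` where `f` is a Python function on 0/1 values, and the names `evaluate`, `atoms`, `AND`, `OR`, `NOT` are hypothetical.

```python
# Minimal sketch of boolean formulas over a set X (assumed encoding):
#   - an atom is a string,
#   - an operator node is a pair (f, children), with f a k-ary function on {0, 1}.

def evaluate(phi, v):
    """eval_v(phi): look atoms up in v, apply f to the evaluated children."""
    if isinstance(phi, tuple):
        f, children = phi
        return f(*(evaluate(c, v) for c in children))
    return v[phi]

def atoms(phi):
    """Atom(phi): the set of atoms occurring in phi."""
    if isinstance(phi, tuple):
        _, children = phi
        result = set()
        for c in children:
            result |= atoms(c)
        return result
    return {phi}

AND = lambda x, y: x & y
OR = lambda x, y: x | y
NOT = lambda x: 1 - x

# phi = x OR (y AND NOT x)
phi = (OR, ("x", (AND, ("y", (NOT, ("x",))))))
```

For instance, `evaluate(phi, {"x": 0, "y": 1})` evaluates the formula under the valuation sending x to 0 and y to 1, and `atoms(phi)` collects the atoms x and y.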

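Let $\Sigma$ be an alphabet, $w$ be a word in $\Sigma^*$ and $L$ be a language over $\Sigma$.
Deciding whether $w$ belongs to $L$ is called the membership problem for the
language $L$. We denote by $r_w(L)$ the boolean equal to 1 if $w \in L$, 0 otherwise.
Let $L_1, \ldots, L_k$ be $k$ languages. We denote by $\cdot$, $^*$ and $f_L$ for any $k$-ary boolean
function $f$ the operators defined as follows:

$$L_1 \cdot L_2 = \{w_1 \cdot w_2 \in \Sigma^* \mid r_{w_1}(L_1) \wedge r_{w_2}(L_2) = 1\},$$
$$L_1^* = \{\varepsilon\} \cup \{w_1 \cdots w_n \in \Sigma^* \mid n \in \mathbb{N} \wedge \forall k \in \{1, \ldots, n\},\ r_{w_k}(L_1) = 1\},$$
$$f_L(L_1, \ldots, L_k) = \{w \in \Sigma^* \mid f(r_w(L_1), \ldots, r_w(L_k)) = 1\}.$$

The quotient of a regular language $L$ with respect to a word $w$, that is defined as the set
$w^{-1}(L) = \{w' \in \Sigma^* \mid r_{ww'}(L) = 1\}$, can be inductively computed as follows:
$\varepsilon^{-1}(L) = L$, and for $a \in \Sigma$,

$$a^{-1}(L) = \begin{cases}
a^{-1}(L_1) \cdot L_2 \cup a^{-1}(L_2) & \text{if } L = L_1 \cdot L_2 \wedge \varepsilon \in L_1,\\
a^{-1}(L_1) \cdot L_2 & \text{if } L = L_1 \cdot L_2 \wedge \varepsilon \notin L_1,\\
a^{-1}(L_1) \cdot L_1^* & \text{if } L = L_1^*,\\
f_L(a^{-1}(L_1), \ldots, a^{-1}(L_n)) & \text{if } L = f_L(L_1, \ldots, L_n),\\
\{\varepsilon\} & \text{if } L = \{a\},\\
\emptyset & \text{otherwise,}
\end{cases}$$

$$(a \cdot w')^{-1}(L) = w'^{-1}(a^{-1}(L)) \text{ for } w' \in \Sigma^+.$$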
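As a small illustration of the quotient and of its use for the membership problem, the following Python sketch works on a finite language given explicitly as a set of strings. This restriction to finite languages, as well as the names `quotient` and `member`, are our own assumptions made for the sake of a runnable example.

```python
# Sketch: left quotient of a FINITE language, given as a set of Python strings.

def quotient(word, language):
    """w^{-1}(L) = { w' | w w' in L }."""
    return {x[len(word):] for x in language if x.startswith(word)}

def member(word, language):
    """w is in L if and only if the empty word is in w^{-1}(L)."""
    return "" in quotient(word, language)

L = {"ab", "abc", "b", ""}
```

On this example, `quotient("b", quotient("a", L))` and `quotient("ab", L)` yield the same set, which matches the rule $(a \cdot w')^{-1}(L) = w'^{-1}(a^{-1}(L))$.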

An Alternating Automaton (AA) is a 5-tuple $A = (\Sigma, Q, I, F, \delta)$ where $\Sigma$ is
an alphabet, $Q$ is a set of states, $I$ is a boolean formula over $Q$, $F$ is a function
from $Q$ to $\mathbb{B}$ and $\delta$ is a function from $Q \times \Sigma$ to BoolForm($Q$). The function $\delta$
is extended from BoolForm($Q$) $\times\ \Sigma^*$ to BoolForm($Q$) as follows: $\delta(\phi, \varepsilon) = \phi$,
$\delta(\phi, aw) = \delta(\delta(\phi, a), w)$, $\delta(f_{\mathbb{B}}(\phi_1, \ldots, \phi_k), a) = f_{\mathbb{B}}(\delta(\phi_1, a), \ldots, \delta(\phi_k, a))$ where $a$
is any symbol in $\Sigma$, $w$ is any word in $\Sigma^*$ and $\phi_1, \ldots, \phi_k$ are any $k$ boolean
formulas over $Q$. The accessible part of $A$ is the alternating automaton $(\Sigma, Q', I, F', \delta')$
defined by: $Q' = \{q \in Q \mid \exists w \in \Sigma^*, q \in \mathrm{Atom}(\delta(I, w))\}$; $\forall q \in Q'$, $F'(q) = F(q)$;
$\forall a \in \Sigma, \forall q \in Q'$, $\delta'(q, a) = \delta(q, a)$. The language recognized by the
alternating automaton $A$ is the subset $L(A)$ of $\Sigma^*$ defined by
$L(A) = \{w \in \Sigma^* \mid \mathrm{eval}_F(\delta(I, w)) = 1\}$. Whenever $Q$ is a finite set, $A$ is said to be an Alternating
Finite state Automaton (AFA).
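The acceptance condition $\mathrm{eval}_F(\delta(I, w)) = 1$ can be sketched as follows in Python. The encoding is an assumption of ours (atoms are state names, operator nodes are pairs `(f, children)`, $\delta$ and $F$ are dictionaries), and the example automaton, which recognizes the words with an even number of a's that do not end with b, is purely illustrative.

```python
# Sketch of an alternating automaton run (assumed encoding, see lead-in).

def evaluate(phi, v):
    """eval_v(phi) for formulas encoded as atoms or pairs (f, children)."""
    if isinstance(phi, tuple):
        f, children = phi
        return f(*(evaluate(c, v) for c in children))
    return v[phi]

def step(phi, a, delta):
    """Extension of delta to formulas: each atom q is replaced by delta(q, a)."""
    if isinstance(phi, tuple):
        f, children = phi
        return (f, tuple(step(c, a, delta) for c in children))
    return delta[(phi, a)]

def accepts(word, initial, final, delta):
    """w is accepted if and only if eval_F(delta(I, w)) = 1."""
    phi = initial
    for a in word:
        phi = step(phi, a, delta)
    return evaluate(phi, final) == 1

AND = lambda x, y: x & y
NOT = lambda x: 1 - x

# States e/o track the parity of a's; states n/y track "last symbol is b".
delta = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o",
         ("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}
final = {"e": 1, "o": 0, "n": 0, "y": 1}
initial = (AND, ("e", (NOT, ("n",))))
```

The negation in the initial formula is what a plain NFA could not express directly; here it is evaluated only once the whole word has been read.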

A regular expression $E$ over an alphabet $\Sigma$ is inductively defined by: $E = a$,
$E = 1$, $E = 0$, $E = E_1 \cdot E_2$, $E = E_1^*$ or $E = f_e(E_1, \ldots, E_k)$, where $\forall k \in \mathbb{N}$, $E_k$ is
a regular expression, $a \in \Sigma$ and $f_e$ is the operator associated to the $k$-ary boolean
function $f$, e.g. $+$ is the operator associated to $\vee$. In the following, we assume
that the regular expression operators as well as the boolean formula operators
have no specific algebraic properties, unlike the boolean functions. For instance,
the operator $+$ is neither associative, nor commutative, nor idempotent. A regular
expression is said to be simple if the only boolean operator used is the sum.

The set of the regular expressions over an alphabet $\Sigma$ is denoted by Exp($\Sigma$).
The language denoted by $E$ is the subset $L(E)$ of $\Sigma^*$ inductively defined as
follows: $L(E \cdot F) = L(E) \cdot L(F)$, $L(E^*) = (L(E))^*$, $L(f_e(E_1, \ldots, E_n)) =
f_L(L(E_1), \ldots, L(E_n))$, $L(a) = \{a\}$, $L(0) = \emptyset$, and $L(1) = \{\varepsilon\}$ with $E$ and $F$
any two regular expressions, $a$ any symbol in $\Sigma$ and $f_L$ the operator associated
with $f$ (e.g. $\cup$ is associated to $\vee$). Whenever two expressions $E_1$ and $E_2$ denote
the same language, $E_1$ and $E_2$ are said to be equivalent (denoted by $E_1 \sim E_2$).
In the following, we denote by $r_w(E)$ the boolean $r_w(L(E))$. Notice that the
boolean $r_\varepsilon(E)$ is straightforwardly computed as follows:

$$r_\varepsilon(a) = r_\varepsilon(0) = 0, \quad r_\varepsilon(1) = 1,$$
$$r_\varepsilon(f_e(E_1, \ldots, E_k)) = f(r_\varepsilon(E_1), \ldots, r_\varepsilon(E_k)),$$
$$r_\varepsilon(E_1 \cdot E_2) = r_\varepsilon(E_1) \wedge r_\varepsilon(E_2),$$
$$r_\varepsilon(E_1^*) = 1.$$

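Given a regular expression $E$ over an alphabet $\Sigma$ and a symbol $a$ in $\Sigma$, the
Brzozowski derivative of $E$ w.r.t. $a$ is the expression $\frac{d}{d_a}(E)$ inductively defined
by:

$$\tfrac{d}{d_a}(a) = 1, \quad \tfrac{d}{d_a}(b) = \tfrac{d}{d_a}(1) = \tfrac{d}{d_a}(0) = 0,$$
$$\tfrac{d}{d_a}(f_e(E_1, \ldots, E_k)) = f_e(\tfrac{d}{d_a}(E_1), \ldots, \tfrac{d}{d_a}(E_k)), \quad \tfrac{d}{d_a}(E_1^*) = \tfrac{d}{d_a}(E_1) \cdot E_1^*,$$
$$\tfrac{d}{d_a}(E_1 \cdot E_2) = \begin{cases}
\tfrac{d}{d_a}(E_1) \cdot E_2 + \tfrac{d}{d_a}(E_2) & \text{if } r_\varepsilon(E_1) = 1,\\
\tfrac{d}{d_a}(E_1) \cdot E_2 & \text{otherwise,}
\end{cases}$$

where $b$ is any symbol in $\Sigma \setminus \{a\}$, $E_1, \ldots, E_k$ are any $k$ regular expressions over
$\Sigma$, and $f_e$ is the operator associated with the $k$-ary boolean function $f$. Notice
that Brzozowski defines the dissimilar derivative of $E$ as the expression $\frac{d'}{d_a}(E)$
inductively computed by substituting the operator $+_{ACI}$ for the $+$ operator
in the derivative formulas, where $+_{ACI}$ is the associative, commutative and
idempotent version of $+$.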
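The rules of the Brzozowski derivative translate almost literally into code. The following Python sketch is only illustrative and relies on an encoding of our own: expressions are nested tuples tagged `"sym"`, `"one"`, `"zero"`, `"cat"`, `"star"`, and the boolean operators are limited to `"or"`, `"and"` and `"not"`. The function names `nullable`, `deriv` and `matches` are hypothetical.

```python
# Sketch of Brzozowski derivatives on tuple-encoded expressions (assumed encoding).
BOOL = {"or": lambda x, y: x | y, "and": lambda x, y: x & y, "not": lambda x: 1 - x}

def nullable(e):
    """r_eps(E): 1 if the empty word belongs to L(E), 0 otherwise."""
    tag = e[0]
    if tag in ("one", "star"):
        return 1
    if tag in ("zero", "sym"):
        return 0
    if tag == "cat":
        return nullable(e[1]) & nullable(e[2])
    return BOOL[tag](*(nullable(x) for x in e[1:]))

def deriv(a, e):
    """Derivative of e w.r.t. the symbol a; no simplification is applied."""
    tag = e[0]
    if tag == "sym":
        return ("one",) if e[1] == a else ("zero",)
    if tag in ("one", "zero"):
        return ("zero",)
    if tag == "star":
        return ("cat", deriv(a, e[1]), e)
    if tag == "cat":
        left = ("cat", deriv(a, e[1]), e[2])
        return ("or", left, deriv(a, e[2])) if nullable(e[1]) else left
    # Boolean operator: derive each argument and keep the operator.
    return (tag,) + tuple(deriv(a, x) for x in e[1:])

def matches(word, e):
    """w is in L(E) iff the derivative of E w.r.t. w is nullable."""
    for a in word:
        e = deriv(a, e)
    return nullable(e) == 1

A, B = ("sym", "a"), ("sym", "b")
E = ("cat", ("star", ("cat", A, B)), A)                          # (ab)*a
F = ("cat", ("star", ("cat", A, ("cat", B, ("cat", A, B)))), A)  # (abab)*a
```

Since the sum is treated here as a plain syntactic operator, the computed expressions grow with the length of the word; the dissimilar derivative mentioned above is what bounds their number.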
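
Given a simple regular expression $E$ over an alphabet $\Sigma$ and a symbol $a$ in
$\Sigma$, the Antimirov partial derivative of $E$ w.r.t. $a$ is the set of expressions $\frac{\partial}{\partial_a}(E)$
inductively defined by:

$$\tfrac{\partial}{\partial_a}(a) = \{1\}, \quad \tfrac{\partial}{\partial_a}(b) = \tfrac{\partial}{\partial_a}(1) = \tfrac{\partial}{\partial_a}(0) = \emptyset,$$
$$\tfrac{\partial}{\partial_a}(E_1 + E_2) = \tfrac{\partial}{\partial_a}(E_1) \cup \tfrac{\partial}{\partial_a}(E_2), \quad \tfrac{\partial}{\partial_a}(E_1^*) = \tfrac{\partial}{\partial_a}(E_1) \cdot E_1^*,$$
$$\tfrac{\partial}{\partial_a}(E_1 \cdot E_2) = \begin{cases}
\tfrac{\partial}{\partial_a}(E_1) \cdot E_2 \cup \tfrac{\partial}{\partial_a}(E_2) & \text{if } r_\varepsilon(E_1) = 1,\\
\tfrac{\partial}{\partial_a}(E_1) \cdot E_2 & \text{otherwise,}
\end{cases}$$

where $b$ is any symbol in $\Sigma \setminus \{a\}$, $E_1, \ldots, E_k$ are any $k$ regular expressions
over $\Sigma$, and for any set $\mathcal{E}$ of regular expressions, for any regular expression $F$,

$$\mathcal{E} \cdot F = \bigcup_{E \in \mathcal{E}} \{E \cdot F\}.$$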
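Partial derivatives can be sketched in the same spirit. The Python code below is an illustration under our own assumptions: simple expressions are nested tuples tagged `"sym"`, `"one"`, `"zero"`, `"or"`, `"cat"`, `"star"`, sets of expressions are `frozenset` values, and the names `cat_set`, `pderiv` and `matches` are hypothetical.

```python
# Sketch of Antimirov partial derivatives for simple expressions (assumed encoding).

def nullable(e):
    """True if the empty word belongs to L(e)."""
    tag = e[0]
    if tag in ("one", "star"):
        return True
    if tag in ("zero", "sym"):
        return False
    if tag == "cat":
        return nullable(e[1]) and nullable(e[2])
    return nullable(e[1]) or nullable(e[2])  # tag == "or"

def cat_set(es, f):
    """The set {E . F | E in es}."""
    return frozenset(("cat", e, f) for e in es)

def pderiv(a, e):
    """Set of partial derivatives of e w.r.t. the symbol a."""
    tag = e[0]
    if tag == "sym":
        return frozenset([("one",)]) if e[1] == a else frozenset()
    if tag in ("one", "zero"):
        return frozenset()
    if tag == "or":
        return pderiv(a, e[1]) | pderiv(a, e[2])
    if tag == "star":
        return cat_set(pderiv(a, e[1]), e)
    left = cat_set(pderiv(a, e[1]), e[2])  # tag == "cat"
    return (left | pderiv(a, e[2])) if nullable(e[1]) else left

def matches(word, e):
    """Follow all partial derivatives; accept if one reached term is nullable."""
    current = frozenset([e])
    for a in word:
        current = frozenset(d for x in current for d in pderiv(a, x))
    return any(nullable(x) for x in current)

A, B = ("sym", "a"), ("sym", "b")
E = ("cat", ("star", ("or", A, B)), ("cat", A, B))  # (a+b)*ab
```

Because a derivative is a set rather than a sum, duplicated terms collapse automatically, which is the reason why the number of derived terms stays bounded.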



3 Derivation via a Support 

We now introduce the notion of a derivation via a support. Recall that a
Brzozowski derivative is an expression, whereas an Antimirov derivative is a set
of expressions. In our framework, a derivative is more generally an element
of an arbitrary set, called the structure set. For example, a structure can be
an expression, a set of expressions or a set of sets of expressions. A support is
essentially made of a structure set equipped with operators, and of a mapping
that transforms a structure into an expression.

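Definition 1. Let $\Sigma$ be an alphabet. Let $\mathbb{E}$ be a set and $h$ be a mapping from $\mathbb{E}$
to Exp($\Sigma$). Let $\mathcal{O}$ be a set containing:

• for any $k$-ary boolean function $f$, an operator $f_{\mathbb{E}}$ from $\mathbb{E}^k$ to $\mathbb{E}$,

• an operator $\cdot_{\mathbb{E}}$ from $\mathbb{E} \times \mathrm{Exp}(\Sigma)$ to $\mathbb{E}$.

Let $1_{\mathbb{E}}$ and $0_{\mathbb{E}}$ be two elements in $\mathbb{E}$. The 6-tuple $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$ is said
to be a support if the three following conditions are satisfied:

1. for any $k$ elements $\mathcal{E}_1, \ldots, \mathcal{E}_k$ in $\mathbb{E}$:

$$h(f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k)) \sim f_e(h(\mathcal{E}_1), \ldots, h(\mathcal{E}_k)),$$

2. for any element $\mathcal{E}$ in $\mathbb{E}$, for any expression $E$ in Exp($\Sigma$):

$$h(\mathcal{E} \cdot_{\mathbb{E}} E) \sim h(\mathcal{E}) \cdot E,$$

3. $h(1_{\mathbb{E}}) \sim 1$ and $h(0_{\mathbb{E}}) \sim 0$.

Notice that the expressions $h(f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k))$ and $f_e(h(\mathcal{E}_1), \ldots, h(\mathcal{E}_k))$ need
not be identical. They are only required to denote the same language. A
support is based on a set of generic structures that can be used to handle
regular expressions. We now define the notion of regular expression derivation
via a support.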

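Definition 2. Let $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$ be a support. The derivation via $\mathbb{S}$ is
the mapping $D$ from $\Sigma^+ \times \mathrm{Exp}(\Sigma)$ to $\mathbb{E}$ inductively defined for any $a \in \Sigma$, for
any word $w$ in $\Sigma^+$ and for any expression $E$ in Exp($\Sigma$) by:

$$D(a, E) = \begin{cases}
f_{\mathbb{E}}(D(a, E_1), \ldots, D(a, E_k)) & \text{if } E = f_e(E_1, \ldots, E_k),\\
(D(a, E_1) \cdot_{\mathbb{E}} E_2) \vee_{\mathbb{E}} D(a, E_2) & \text{if } E = E_1 \cdot E_2 \wedge \varepsilon \in L(E_1),\\
D(a, E_1) \cdot_{\mathbb{E}} E_2 & \text{if } E = E_1 \cdot E_2 \wedge \varepsilon \notin L(E_1),\\
D(a, E_1) \cdot_{\mathbb{E}} E_1^* & \text{if } E = E_1^*,\\
1_{\mathbb{E}} & \text{if } E = a,\\
0_{\mathbb{E}} & \text{if } E \in \Sigma \setminus \{a\} \cup \{1, 0\},
\end{cases}$$

$$D(w, E) = D(u, h(D(a, E))) \text{ if } w = au \wedge u \in \Sigma^+.$$

Furthermore, if for every expression $E$ in Exp($\Sigma$), the set $\{D(w, E) \mid w \in \Sigma^+\}$
is finite, the derivation $D$ is said to be a finite derivation.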



3.1 Classical Derivations are Derivations via a Support 

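This subsection illustrates the fact that both Antimirov's derivation and
Brzozowski's one are derivations via a support.

Definition 3. We denote by $\mathbb{S}_A = (\Sigma, \mathbb{E} = 2^{\mathrm{Exp}(\Sigma)}, h_A, \mathcal{O}_A, \{1\}, \emptyset)$ the 6-tuple
defined by:

• for any $\mathcal{E} \in 2^{\mathrm{Exp}(\Sigma)}$, $h_A(\mathcal{E}) = \sum_{E \in \mathcal{E}} E$,

• $\mathcal{O}_A = \{f_{\mathbb{E}} \mid f \text{ is a } k\text{-ary boolean function}\} \cup \{\cdot_{\mathbb{E}}\}$ where for any elements
$\mathcal{E}, \mathcal{E}_1, \ldots, \mathcal{E}_k$ in $\mathbb{E}$ and for any expression $F$ in Exp($\Sigma$),

- $\mathcal{E} \cdot_{\mathbb{E}} F = \bigcup_{E \in \mathcal{E}} \{E \cdot F\}$,

- $f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k) = \mathcal{E}_1 \cup \mathcal{E}_2$ if $k = 2$ and $f = \vee$,

- $f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k) = \{f_e(h_A(\mathcal{E}_1), \ldots, h_A(\mathcal{E}_k))\}$ otherwise.

Proposition 1. The 6-tuple $\mathbb{S}_A$ is a support. Furthermore, for any simple
regular expression $E$ over $\Sigma$, for any symbol $a$, it holds $D_A(a, E) = \frac{\partial}{\partial_a}(E)$,
where $D_A$ is the derivation via $\mathbb{S}_A$.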

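Proof. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be two sets of simple regular expressions and $E$ be a simple
regular expression. The condition (1) of Definition 1 is satisfied since:

(a) if $k = 2$ and $f = \vee$,

$$\begin{aligned}
L(h_A(\mathcal{E}_1 \cup \mathcal{E}_2)) &= L(\textstyle\sum_{E \in \mathcal{E}_1 \cup \mathcal{E}_2} E)\\
&= \textstyle\bigcup_{E_1 \in \mathcal{E}_1} L(E_1) \cup \bigcup_{E_2 \in \mathcal{E}_2} L(E_2)\\
&= L(h_A(\mathcal{E}_1) + h_A(\mathcal{E}_2))
\end{aligned}$$

(b) and otherwise,

$$\begin{aligned}
L(h_A(f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k))) &= L(h_A(\{f_e(h_A(\mathcal{E}_1), \ldots, h_A(\mathcal{E}_k))\}))\\
&= L(f_e(h_A(\mathcal{E}_1), \ldots, h_A(\mathcal{E}_k)))
\end{aligned}$$

The condition (2) of Definition 1 is satisfied since:

$$\begin{aligned}
L(h_A(\mathcal{E}_1 \cdot_{\mathbb{E}} E)) &= L(h_A(\textstyle\bigcup_{E_1 \in \mathcal{E}_1} \{E_1 \cdot E\}))\\
&= L(\textstyle\sum_{E_1 \in \mathcal{E}_1} E_1 \cdot E)\\
&= L(\textstyle\sum_{E_1 \in \mathcal{E}_1} E_1) \cdot L(E)\\
&= L(h_A(\mathcal{E}_1)) \cdot L(E)
\end{aligned}$$

The condition (3) of Definition 1 is satisfied since:

$$L(h_A(\{1\})) = L(1)$$
$$L(h_A(\emptyset)) = \emptyset = L(0)$$

Consequently, $\mathbb{S}_A$ is a support. Moreover, since the operators $\cup$ and $\cdot_{\mathbb{E}}$ in $\mathcal{O}_A$
are the operations used in partial derivation, it can be shown by induction that
for any simple regular expression $E$ over $\Sigma$ and for any symbol $a$, $D_A(a, E) =
\frac{\partial}{\partial_a}(E)$. □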

Definition 4. We denote by $\mathbb{S}_B = (\Sigma, \mathbb{E}' = \mathrm{Exp}(\Sigma), h_B, \mathcal{O}_B, 1, 0)$ the 6-tuple
defined by:

• for any $E \in \mathrm{Exp}(\Sigma)$, $h_B(E) = E$,

• $\mathcal{O}_B = \{f_{\mathbb{E}'} \mid f \text{ is a } k\text{-ary boolean function}\} \cup \{\cdot_{\mathbb{E}'}\}$, where for any $E, F, E_1, \ldots, E_k$
elements in Exp($\Sigma$),

- $E \cdot_{\mathbb{E}'} F = E \cdot F$,

- $f_{\mathbb{E}'}(E_1, \ldots, E_k) = E_1 + E_2$ if $k = 2$ and $f = \vee$,

- $f_{\mathbb{E}'}(E_1, \ldots, E_k) = f_e(E_1, \ldots, E_k)$ otherwise.

Proposition 2. The 6-tuple $\mathbb{S}_B$ is a support. Furthermore, for any regular
expression $E$ over $\Sigma$, for any symbol $a$, it holds $D_B(a, E) = \frac{d}{d_a}(E)$, where $D_B$
is the derivation via $\mathbb{S}_B$.

Proof. Since $h_B$ is the identity and $\mathbb{E}' = \mathrm{Exp}(\Sigma)$, it is obvious that $\mathbb{S}_B$ is a
support. Moreover, since the operators in $\mathcal{O}_B$ are the operators of regular
expressions, it can be shown by induction that for any regular expression $E$
over $\Sigma$ and for any symbol $a$, $D_B(a, E) = \frac{d}{d_a}(E)$. □

Let us notice that, by definition, the derivation $D_A$ more generally addresses
unrestricted expressions; therefore it provides a natural extension for Antimirov
derivation. See [7] and Section 5 for alternative extensions.
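To make Definitions 2, 3 and 4 more concrete, here is a hedged Python sketch of a derivation parameterized by a support. Everything in it is an assumption made for illustration: expressions are nested tuples, boolean operators are restricted to `"or"`, `"and"` and `"not"`, the class `Support` and the functions `derive`, `derive_word` and `member` are hypothetical names, and `S_A` and `S_B` are our own encodings of the two supports. The function `member` tests whether the expression associated with the derivative w.r.t. the whole word denotes a language containing the empty word.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable

BOOL = {"or": lambda x, y: x | y, "and": lambda x, y: x & y, "not": lambda x: 1 - x}

def nullable(e):
    """r_eps(E) on tuple-encoded expressions."""
    tag = e[0]
    if tag in ("one", "star"):
        return 1
    if tag in ("zero", "sym"):
        return 0
    if tag == "cat":
        return nullable(e[1]) & nullable(e[2])
    return BOOL[tag](*(nullable(x) for x in e[1:]))

@dataclass
class Support:
    h: Callable[[Any], tuple]         # structure -> expression
    op: Callable[[str, list], Any]    # plays the role of the operators f_E
    cat: Callable[[Any, tuple], Any]  # plays the role of the operator ._E
    one: Any
    zero: Any

def derive(s, a, e):
    """D(a, E) following the case analysis of Definition 2."""
    tag = e[0]
    if tag == "sym":
        return s.one if e[1] == a else s.zero
    if tag in ("one", "zero"):
        return s.zero
    if tag == "star":
        return s.cat(derive(s, a, e[1]), e)
    if tag == "cat":
        left = s.cat(derive(s, a, e[1]), e[2])
        return s.op("or", [left, derive(s, a, e[2])]) if nullable(e[1]) else left
    return s.op(tag, [derive(s, a, x) for x in e[1:]])

def derive_word(s, w, e):
    """D(w, E) for a non-empty word w: derive, map back through h, derive again."""
    d = derive(s, w[0], e)
    for a in w[1:]:
        d = derive(s, a, s.h(d))
    return d

def member(s, w, e):
    """Membership test: is the empty word in L(h(D(w, E)))?"""
    return nullable(e) == 1 if not w else nullable(s.h(derive_word(s, w, e))) == 1

def h_a(es):
    """h_A: the sum of the expressions of a set (0 for the empty set)."""
    terms = sorted(es, key=repr)
    return reduce(lambda x, y: ("or", x, y), terms) if terms else ("zero",)

def op_a(name, xs):
    """f_E for S_A: union for the sum, a singleton otherwise."""
    return xs[0] | xs[1] if name == "or" else frozenset([(name, *map(h_a, xs))])

S_A = Support(h=h_a, op=op_a,
              cat=lambda es, f: frozenset(("cat", e, f) for e in es),
              one=frozenset([("one",)]), zero=frozenset())
S_B = Support(h=lambda e: e, op=lambda name, xs: (name, *xs),
              cat=lambda e, f: ("cat", e, f), one=("one",), zero=("zero",))

A, B = ("sym", "a"), ("sym", "b")
E1 = ("cat", ("star", ("cat", A, B)), A)                          # (ab)*a
E2 = ("cat", ("star", ("cat", A, ("cat", B, ("cat", A, B)))), A)  # (abab)*a
E = ("and", E1, ("not", E2))
```

Swapping `S_A` for `S_B` changes the shape of the derivatives (sets of expressions versus expressions) while `derive` itself is left untouched, which is the point of the support abstraction.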

3.2 Main Properties of Supports 

We now show that the language denoted by the expression associated with any
derivative $D(w, E)$ is equal to the corresponding quotient.

Proposition 3. Let $D$ be the derivation via a support $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$.
Then for any word $w$ in $\Sigma^+$, for any expression $E$ in Exp($\Sigma$), it holds:
$$L(h(D(w, E))) = w^{-1}(L(E)).$$

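Proof. By recurrence over the length of $w$.

1. Let $w = a \in \Sigma$. By induction over the structure of $E$.

$$L(h(D(a, a))) = L(h(1_{\mathbb{E}})) = \{\varepsilon\} = a^{-1}(L(a))$$

$$\begin{aligned}
L(h(D(a, b))) &= L(h(D(a, 1))) = L(h(D(a, 0)))\\
&= L(h(0_{\mathbb{E}})) = \emptyset = a^{-1}(L(b)) = a^{-1}(L(1)) = a^{-1}(L(0))
\end{aligned}$$

$$\begin{aligned}
L(h(D(a, E_1^*))) &= L(h(D(a, E_1) \cdot_{\mathbb{E}} E_1^*))\\
&= L(h(D(a, E_1))) \cdot L(E_1^*) = a^{-1}(L(E_1)) \cdot L(E_1^*) = a^{-1}(L(E_1^*))
\end{aligned}$$

$$\begin{aligned}
L(h(D(a, f_e(E_1, \ldots, E_k)))) &= L(h(f_{\mathbb{E}}(D(a, E_1), \ldots, D(a, E_k))))\\
&= f_L(L(h(D(a, E_1))), \ldots, L(h(D(a, E_k))))\\
&= f_L(a^{-1}(L(E_1)), \ldots, a^{-1}(L(E_k)))\\
&= a^{-1}(f_L(L(E_1), \ldots, L(E_k))) = a^{-1}(L(f_e(E_1, \ldots, E_k)))
\end{aligned}$$

Let us consider that $\varepsilon \in L(E_1)$:

$$\begin{aligned}
L(h(D(a, E_1 \cdot E_2))) &= L(h((D(a, E_1) \cdot_{\mathbb{E}} E_2) \vee_{\mathbb{E}} D(a, E_2)))\\
&= L(h(D(a, E_1) \cdot_{\mathbb{E}} E_2)) \cup L(h(D(a, E_2)))\\
&= L(h(D(a, E_1))) \cdot L(E_2) \cup L(h(D(a, E_2)))\\
&= a^{-1}(L(E_1)) \cdot L(E_2) \cup a^{-1}(L(E_2))\\
&= a^{-1}(L(E_1) \cdot L(E_2)) = a^{-1}(L(E_1 \cdot E_2))
\end{aligned}$$

Let us consider that $\varepsilon \notin L(E_1)$:

$$\begin{aligned}
L(h(D(a, E_1 \cdot E_2))) &= L(h(D(a, E_1) \cdot_{\mathbb{E}} E_2))\\
&= L(h(D(a, E_1))) \cdot L(E_2) = a^{-1}(L(E_1)) \cdot L(E_2)\\
&= a^{-1}(L(E_1) \cdot L(E_2)) = a^{-1}(L(E_1 \cdot E_2))
\end{aligned}$$

2. Let $w = au$ with $a \in \Sigma$ and $u \in \Sigma^+$. According to the recurrence
hypothesis,

$$\begin{aligned}
L(h(D(w, E))) &= L(h(D(u, h(D(a, E)))))\\
&= u^{-1}(L(h(D(a, E)))) = u^{-1}(a^{-1}(L(E))) = (au)^{-1}(L(E))
\end{aligned}$$

□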

From Proposition 3 we deduce that $r_\varepsilon(h(D(w, E))) = r_\varepsilon(w^{-1}(L(E)))$. This
property does not depend on whether the derivation is finite or not, and since the
boolean $r_\varepsilon(E)$ can be inductively computed for any regular expression $E$, any
support defines a syntactical solution of the membership problem of the language
$L(E)$.

Corollary 1. For a given regular expression $E$, any derivation via a support
can be used to solve the membership problem for $L(E)$.

As an example, the support $\mathbb{S}_B$ of Definition 4 can be used to solve the
membership test, even if the associated derivation is not finite.

Given an expression, the finiteness of the set of its derivatives is a necessary 
condition for the construction of an associated finite automaton. It is well-known 
that the set of Brzozowski's derivatives is not necessarily finite whereas the set 
of dissimilar derivatives and the set of Antimirov's derived terms are finite sets. 
We now give two sufficient conditions of finiteness in the general case. The first
one has already been stated in the case of Brzozowski derivatives [5]. The second
one is related to the mapping h of the support, that needs to satisfy specific
properties.

Proposition 4. Let $D$ be the derivation via a support $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$.
The following set of conditions is sufficient for the mapping $D$ to be a finite
derivation:

1. $\vee_{\mathbb{E}}$ is associative, commutative and idempotent (H1);

2. for any $k$-ary boolean function $f$, for any $k$ elements $\mathcal{E}_1, \ldots, \mathcal{E}_k$ in $\mathbb{E}$,

$$D(a, h(f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k))) = f_{\mathbb{E}}(D(a, h(\mathcal{E}_1)), \ldots, D(a, h(\mathcal{E}_k))) \quad \text{(H2)}.$$

Proof. Let us show by induction over the structure of regular expressions that
for any expression $E$ in Exp($\Sigma$), the set $\{D(w, E) \mid w \in \Sigma^+\}$ is finite.

If $E \in \Sigma \cup \{1, 0\}$: According to the definition of derivation, the proposition
holds.

If $E = E_1^*$: Let $w$ be a word in $\Sigma^+$. Let us show by recurrence over the
length of $w$ that $D(w, E_1^*)$ is a finite $\vee_{\mathbb{E}}$-combination of elements in the set
$\{D(w', E_1) \cdot_{\mathbb{E}} E_1^* \mid w' \neq \varepsilon \text{ is a suffix of } w\}$.

1. Let $w = a \in \Sigma$. Since $D(a, E_1^*) = D(a, E_1) \cdot_{\mathbb{E}} E_1^*$, the property holds.

2. Let $w = ua$ with $a \in \Sigma$ and $u \in \Sigma^+$. By definition, $D(ua, E_1^*) =
D(a, h(D(u, E_1^*)))$. By the recurrence hypothesis, $D(u, E_1^*)$ is a finite
$\vee_{\mathbb{E}}$-combination of elements in the set $\{D(w', E_1) \cdot_{\mathbb{E}} E_1^* \mid w' \neq \varepsilon \text{ is a suffix of } u\}$.
According to hypothesis H2, $D(a, h(D(u, E_1^*)))$ is a finite $\vee_{\mathbb{E}}$-combination
of elements in the set $\{D(a, h(D(w', E_1))) \cdot_{\mathbb{E}} E_1^* \mid w' \neq \varepsilon \text{ is a suffix of } u\} \cup
\{D(a, E_1) \cdot_{\mathbb{E}} E_1^*\}$. So, $D(ua, E_1^*)$ is a finite $\vee_{\mathbb{E}}$-combination of elements in
the set $\{D(w', E_1) \cdot_{\mathbb{E}} E_1^* \mid w' \neq \varepsilon \text{ is a suffix of } ua\}$.

As a consequence, since the set $\{D(w, E_1) \mid w \in \Sigma^+\}$ is a finite set by
induction hypothesis, since $\mathrm{Card}(\bigcup_{w \in \Sigma^+} \{D(w', E_1) \mid w' \neq \varepsilon \text{ is a suffix of }
w\}) = \mathrm{Card}(\bigcup_{w \in \Sigma^+} \{D(w, E_1)\})$ and since $\vee_{\mathbb{E}}$ is associative, commutative and
idempotent, we get:

$$\mathrm{Card}(\{D(w, E_1^*) \mid w \in \Sigma^+\}) \le 2^{\mathrm{Card}(\{D(w, E_1) \mid w \in \Sigma^+\})}.$$

If $E = f_e(E_1, \ldots, E_k)$: Let $w$ be a word in $\Sigma^+$. Let us show by recurrence
over the length of $w$ that $D(w, f_e(E_1, \ldots, E_k)) = f_{\mathbb{E}}(D(w, E_1), \ldots, D(w, E_k))$.

1. Let $w = a \in \Sigma$. According to the definition of $D$, the property holds.

2. Let $w = ua$ with $a \in \Sigma$ and $u \in \Sigma^+$. By definition, $D(ua, f_e(E_1, \ldots, E_k))
= D(a, h(D(u, f_e(E_1, \ldots, E_k))))$. By the recurrence hypothesis,
$D(u, f_e(E_1, \ldots, E_k)) = f_{\mathbb{E}}(D(u, E_1), \ldots, D(u, E_k))$. According to hypothesis H2,

$$\begin{aligned}
D(a, h(f_{\mathbb{E}}(D(u, E_1), \ldots, D(u, E_k)))) &= f_{\mathbb{E}}(D(a, h(D(u, E_1))), \ldots, D(a, h(D(u, E_k))))\\
&= f_{\mathbb{E}}(D(ua, E_1), \ldots, D(ua, E_k)).
\end{aligned}$$

As a consequence, since for every integer $j$ in $\{1, \ldots, k\}$ the set $\{D(w, E_j) \mid w \in
\Sigma^+\}$ is finite by induction hypothesis, we get:

$$\mathrm{Card}(\{D(w, f_e(E_1, \ldots, E_k)) \mid w \in \Sigma^+\})
\le \mathrm{Card}(\{D(w, E_1) \mid w \in \Sigma^+\}) \times \cdots \times \mathrm{Card}(\{D(w, E_k) \mid w \in \Sigma^+\}).$$

If $E = E_1 \cdot E_2$: Let $w$ be a word in $\Sigma^+$. Let us show by recurrence over
the length of $w$ that either $D(w, E_1 \cdot E_2) = D(w, E_1) \cdot_{\mathbb{E}} E_2$ or $D(w, E_1 \cdot E_2) =
(D(w, E_1) \cdot_{\mathbb{E}} E_2) \vee_{\mathbb{E}} \mathcal{E}$ where $\mathcal{E}$ is a finite $\vee_{\mathbb{E}}$-combination of elements in the set
$\{D(w', E_2) \mid w' \neq \varepsilon \text{ is a suffix of } w\}$.

1. Let $w = a \in \Sigma$. According to the definition of $D$, the property holds.

2. Let $w = ua$ with $a \in \Sigma$ and $u \in \Sigma^+$.

By definition, $D(ua, E_1 \cdot E_2) = D(a, h(D(u, E_1 \cdot E_2)))$.
Two cases have to be considered:

(a) $D(u, E_1 \cdot E_2) = D(u, E_1) \cdot_{\mathbb{E}} E_2$. Either $D(ua, E_1 \cdot E_2) = D(a, h(D(u,
E_1) \cdot_{\mathbb{E}} E_2))$, or $D(ua, E_1 \cdot E_2) = D(a, h(D(u, E_1) \cdot_{\mathbb{E}} E_2)) \vee_{\mathbb{E}} D(a, h(E_2))$.
According to the recurrence hypothesis, both of these cases satisfy
the proposition.

(b) $D(u, E_1 \cdot E_2) = (D(u, E_1) \cdot_{\mathbb{E}} E_2) \vee_{\mathbb{E}} \mathcal{E}$ where $\mathcal{E}$ is a finite $\vee_{\mathbb{E}}$-combination
of elements in the set $\{D(w', E_2) \mid w' \neq \varepsilon \text{ is a suffix of } u\}$. Either
$D(ua, E_1 \cdot E_2) = D(a, h(D(u, E_1) \cdot_{\mathbb{E}} E_2)) \vee_{\mathbb{E}} D(a, h(\mathcal{E}))$ or $D(ua, E_1 \cdot
E_2) = D(a, h(D(u, E_1) \cdot_{\mathbb{E}} E_2)) \vee_{\mathbb{E}} D(a, h(E_2)) \vee_{\mathbb{E}} D(a, h(\mathcal{E}))$. According
to hypothesis H2, $\mathcal{E}' = D(a, h(\mathcal{E}))$ is a finite $\vee_{\mathbb{E}}$-combination of
elements in the set $\{D(a, h(D(w', E_2))) \mid w' \neq \varepsilon \text{ is a suffix of } u\}$, a set
that equals $\{D(w', E_2) \mid w' \neq \varepsilon \text{ is a suffix of } ua\}$.

Consequently, $D(ua, E_1 \cdot E_2) = (D(ua, E_1) \cdot_{\mathbb{E}} E_2) \vee_{\mathbb{E}} \mathcal{E}$ where $\mathcal{E}$ is a finite
$\vee_{\mathbb{E}}$-combination of elements in the set $\{D(w', E_2) \mid w' \neq \varepsilon \text{ is a suffix of } ua\}$.

As a consequence, since the sets $\{D(w, E_1) \mid w \in \Sigma^+\}$ and $\{D(w, E_2) \mid w \in
\Sigma^+\}$ are finite by induction hypothesis and since $\vee_{\mathbb{E}}$ is associative, commutative
and idempotent, we get:

$$\mathrm{Card}(\{D(w, E_1 \cdot E_2) \mid w \in \Sigma^+\})
\le \mathrm{Card}(\{D(w, E_1) \mid w \in \Sigma^+\}) \times 2^{\mathrm{Card}(\{D(w, E_2) \mid w \in \Sigma^+\})}.$$

□

The derivation $D_A$ of Definition 3 is an example of finite derivation since
$\cup$ is associative, commutative and idempotent, and since for any $k$-ary boolean
function $f$ and for any $k$ elements $\mathcal{E}_1, \ldots, \mathcal{E}_k$ in $2^{\mathrm{Exp}(\Sigma)}$, we have:

$$D_A(a, h_A(f_{\mathbb{E}}(\mathcal{E}_1, \ldots, \mathcal{E}_k))) = f_{\mathbb{E}}(D_A(a, h_A(\mathcal{E}_1)), \ldots, D_A(a, h_A(\mathcal{E}_k))).$$

On the other hand, since $+$ is not an ACI law, Proposition 4 does not allow us
to conclude for the derivation $D_B$ of Definition 4. Brzozowski showed in [4] that
it is possible to compute a finite set of dissimilar derivatives using a quotient
of the expressions w.r.t. an ACI sum. It can be achieved by considering the
support $\mathbb{S}'_B$ defined as follows:

Definition 5. We denote by $\mathbb{S}'_B = (\Sigma, \mathbb{E}'' = 2^{\mathrm{Exp}(\Sigma)}, h_A, \mathcal{O}'_B, \{1\}, \{0\})$ the
6-tuple defined by:

• for any $\mathcal{E} \in 2^{\mathrm{Exp}(\Sigma)}$, $h_A(\mathcal{E}) = \sum_{E \in \mathcal{E}} E$ (Definition 3),

• $\mathcal{O}'_B = \{f_{\mathbb{E}''} \mid f \text{ is a } k\text{-ary boolean function}\} \cup \{\cdot_{\mathbb{E}''}\}$, where for any $\mathcal{E}_1, \ldots, \mathcal{E}_k$
elements in $\mathbb{E}''$,

- $\mathcal{E}_1 \cdot_{\mathbb{E}''} F = \{(\sum_{E \in \mathcal{E}_1} E) \cdot F\}$,

- $f_{\mathbb{E}''}(\mathcal{E}_1, \ldots, \mathcal{E}_k) = \mathcal{E}_1 \cup \mathcal{E}_2$ if $k = 2$ and $f = \vee$,

- $f_{\mathbb{E}''}(\mathcal{E}_1, \ldots, \mathcal{E}_k) = \{f_e(h_A(\mathcal{E}_1), \ldots, h_A(\mathcal{E}_k))\}$ otherwise.

Proposition 5. The 6-tuple $\mathbb{S}'_B$ is a support. Furthermore, for any regular
expression $E$ and for any symbol $a$, it holds that $h_A(D'_B(a, E))$ is the dissimilar
derivative of $E$ w.r.t. $a$, where $D'_B$ is the derivation via the support $\mathbb{S}'_B$.

Proof. According to Definition 3, Definition 5 and Proposition 1, the conditions
(1) and (3) of Definition 1 are satisfied by $\mathbb{S}'_B$.

The condition (2) of Definition 1 is satisfied since:

$$\begin{aligned}
L(h_A(\mathcal{E}_1 \cdot_{\mathbb{E}''} E)) &= L(h_A(\{(\textstyle\sum_{F \in \mathcal{E}_1} F) \cdot E\}))\\
&= L((\textstyle\sum_{F \in \mathcal{E}_1} F) \cdot E)\\
&= L(\textstyle\sum_{F \in \mathcal{E}_1} F) \cdot L(E)\\
&= L(h_A(\mathcal{E}_1)) \cdot L(E)
\end{aligned}$$

Consequently, $\mathbb{S}'_B$ is a support.

Moreover, since the operator $\cup$ is an ACI law and since the catenation product
$\cdot_{\mathbb{E}''}$ returns a singleton, it can be shown by induction that for any regular
expression $E$ and for any symbol $a$, it holds that $h_A(D'_B(a, E))$ is the dissimilar
derivative of $E$ w.r.t. $a$. □
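The effect of an ACI sum on finiteness can be observed with a short Python sketch. It is not the support of Definition 5 itself but an illustration under our own assumptions: simple expressions are nested tuples, a sum is stored as `("or", frozenset_of_terms)` so that associativity, commutativity and idempotence hold by construction, and the names `aci_sum`, `deriv` and `derivatives` are hypothetical.

```python
# Sketch: derivatives of simple expressions modulo an ACI sum (assumed encoding).

def aci_sum(terms):
    """ACI sum: flatten nested sums and merge duplicated terms."""
    flat = set()
    for t in terms:
        if t[0] == "or":
            flat |= t[1]
        else:
            flat.add(t)
    return next(iter(flat)) if len(flat) == 1 else ("or", frozenset(flat))

def nullable(e):
    tag = e[0]
    if tag in ("one", "star"):
        return True
    if tag in ("zero", "sym"):
        return False
    if tag == "cat":
        return nullable(e[1]) and nullable(e[2])
    return any(nullable(t) for t in e[1])  # tag == "or"

def deriv(a, e):
    """Derivative w.r.t. a, where every sum is built through aci_sum."""
    tag = e[0]
    if tag == "sym":
        return ("one",) if e[1] == a else ("zero",)
    if tag in ("one", "zero"):
        return ("zero",)
    if tag == "star":
        return ("cat", deriv(a, e[1]), e)
    if tag == "cat":
        left = ("cat", deriv(a, e[1]), e[2])
        return aci_sum([left, deriv(a, e[2])]) if nullable(e[1]) else left
    return aci_sum([deriv(a, t) for t in e[1]])  # tag == "or"

def derivatives(e, alphabet):
    """Explore the expression and all its word derivatives (worklist algorithm)."""
    seen, todo = {e}, [e]
    while todo:
        cur = todo.pop()
        for a in alphabet:
            d = deriv(a, cur)
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return seen

A_STAR = ("star", ("sym", "a"))
E = ("cat", A_STAR, A_STAR)  # a* . a*
```

On `E`, the exploration performed by `derivatives(E, "a")` stops after a few steps because repeated summands are merged, whereas a purely syntactic sum would keep producing larger and larger expressions.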



4 From Derivation via a Support to Automata 

Computing the set of derivatives of a regular expression $E$ w.r.t. a derivation
$D$ is similar to computing the transition function $\delta$ of an automaton, where
$\delta(E, w) = h(D(w, E))$. As far as alternating automata are concerned, the
resulting expression $h(D(w, E))$ needs to be transformed into a boolean formula.
This computation is performed through a base function defined as follows.

Definition 6. Let $\Sigma$ be an alphabet. A base function $\mathrm{B}$ is a mapping from
Exp($\Sigma$) to BoolForm(Exp($\Sigma$)) such that for any expression $E$ and for any word
$w$ in $\Sigma^*$:

$$w \in L(E) \Leftrightarrow \mathrm{eval}_{r_w}(\mathrm{B}(E)) = 1.$$

Definition 7. Let $\mathrm{B}$ be a base function and $D$ be the derivation via the
support $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$. Let $E$ be an expression in Exp($\Sigma$). Let $A =
(\Sigma, Q, I, F, \delta)$ be the automaton defined by:

• $Q = \mathrm{Exp}(\Sigma)$,

• $I = E$,

• $\forall q \in Q$, $F(q) = 1$ if $\varepsilon \in L(q)$, $0$ otherwise,

• $\forall a \in \Sigma, \forall q \in Q$, $\delta(q, a) = \mathrm{B}(h(D(a, q)))$.

The accessible part of $A$ is said to be the $(D, \mathrm{B})$-alternating automaton of $E$.
Notice that there may exist an infinite number of states.

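Theorem 1. The $(D, \mathrm{B})$-alternating automaton of a regular expression $E$
recognizes $L(E)$.

Proof. Let $A = (\Sigma, Q, I, F, \delta)$ be the $(D, \mathrm{B})$-alternating automaton of $E$. Let
$D$ be the derivation via the support $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$. Let $w$ be a word
in $\Sigma^+$. Let us show by recurrence over the length of $w$ that for any boolean
formula $\phi = f_{\mathbb{B}}(q_1, \ldots, q_k)$ in BoolForm($Q$), the following proposition (P) holds:
$$\mathrm{eval}_F(\delta(\phi, w)) = \mathrm{eval}_{r_\varepsilon}(f_{\mathbb{B}}(\mathrm{B}(h(D(w, q_1))), \ldots, \mathrm{B}(h(D(w, q_k))))).$$

1. If $w = a \in \Sigma$, by definition of the transition function $\delta$, $\delta(\phi, a) = f_{\mathbb{B}}(\delta(q_1, a),
\ldots, \delta(q_k, a))$. By construction, for any integer $j$ in $\{1, \ldots, k\}$, $\delta(q_j, a) =
\mathrm{B}(h(D(a, q_j)))$.

Since for any state $q \in Q$, $F(q) = 1 \Leftrightarrow \varepsilon \in L(q)$, the following proposition
holds: $\mathrm{eval}_F(\delta(\phi, a)) = \mathrm{eval}_{r_\varepsilon}(f_{\mathbb{B}}(\mathrm{B}(h(D(a, q_1))), \ldots, \mathrm{B}(h(D(a, q_k)))))$.

2. Let $w = au$ with $a \in \Sigma$ and $u \in \Sigma^+$. Then it holds:

$$\begin{aligned}
&\mathrm{eval}_F(\delta(\phi, au))\\
&= \mathrm{eval}_F(\delta(\delta(\phi, a), u)) && \text{(Definition of } \delta)\\
&= \mathrm{eval}_F(\delta(f_{\mathbb{B}}(\delta(q_1, a), \ldots, \delta(q_k, a)), u)) && \text{(Definition of } \delta(\phi, a))\\
&= \mathrm{eval}_F(f_{\mathbb{B}}(\delta(\delta(q_1, a), u), \ldots, \delta(\delta(q_k, a), u))) && \text{(Definition of } \delta)\\
&= \mathrm{eval}_F(f_{\mathbb{B}}(\delta(\mathrm{B}(h(D(a, q_1))), u), \ldots, \delta(\mathrm{B}(h(D(a, q_k))), u))) && \text{(Construction of } \delta)\\
&= f(\mathrm{eval}_F(\delta(\mathrm{B}(h(D(a, q_1))), u)), \ldots, \mathrm{eval}_F(\delta(\mathrm{B}(h(D(a, q_k))), u))) && \text{(Definition of } \mathrm{eval}_F)\\
&= f(\mathrm{eval}_{r_\varepsilon}(\mathrm{B}(h(D(u, h(D(a, q_1)))))), \ldots, \mathrm{eval}_{r_\varepsilon}(\mathrm{B}(h(D(u, h(D(a, q_k))))))) && \text{(Induction hypothesis)}\\
&= \mathrm{eval}_{r_\varepsilon}(f_{\mathbb{B}}(\mathrm{B}(h(D(u, h(D(a, q_1))))), \ldots, \mathrm{B}(h(D(u, h(D(a, q_k))))))) && \text{(Definition of } \mathrm{eval}_{r_\varepsilon})\\
&= \mathrm{eval}_{r_\varepsilon}(f_{\mathbb{B}}(\mathrm{B}(h(D(au, q_1))), \ldots, \mathrm{B}(h(D(au, q_k))))) && \text{(Definition of } D)
\end{aligned}$$

Finally, it holds:

$$\begin{aligned}
w \in L(E) &\Leftrightarrow \varepsilon \in L(h(D(w, E))) && \text{(Proposition 3)}\\
&\Leftrightarrow \mathrm{eval}_{r_\varepsilon}(\mathrm{B}(h(D(w, E)))) = 1 && \text{(Definition 6)}\\
&\Leftrightarrow \mathrm{eval}_F(\delta(E, w)) = 1 && \text{(Proposition (P))}\\
&\Leftrightarrow w \in L(A) && \text{(Definition of the language of an AA)}
\end{aligned}$$

□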

The previous definitions and properties address not necessarily finite automata.
We now give sufficient conditions for the finiteness of the automaton. The basic
idea is that if it is equivalent to derive an expression $E$ or to derive its atoms
$\mathrm{Atom}(\mathrm{B}(E))$, then the set of atoms obtained by a finite derivation is finite.

Definition 8. Let $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$ be a support, let $D$ be the derivation
via $\mathbb{S}$ and $\mathrm{B}$ be a base function. The couple $(D, \mathrm{B})$ satisfies the atom-derivability
property if for any expression $E$ in Exp($\Sigma$) and for any symbol $a$ in $\Sigma$:
$$\mathrm{Atom}(\mathrm{B}(h(D(a, E)))) = \bigcup_{E' \in \mathrm{Atom}(\mathrm{B}(E))} \mathrm{Atom}(\mathrm{B}(h(D(a, E')))).$$

Theorem 2. Let $A$ be the $(D, \mathrm{B})$-alternating automaton of a regular expression
$E$. If $D$ is finite and if $(D, \mathrm{B})$ satisfies the atom-derivability property, then
$A$ is an AFA.

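Proof. Let $D$ be the derivation via the support $\mathbb{S} = (\Sigma, \mathbb{E}, h, \mathcal{O}, 1_{\mathbb{E}}, 0_{\mathbb{E}})$. Let $w$
be a word in $\Sigma^+$.

1. Let us show by recurrence over the length of $w$ that for any expression $q$
in $Q$, $\mathrm{Atom}(\mathrm{B}(h(D(w, q)))) = \mathrm{Atom}(\delta(q, w))$.

(a) If $w = a \in \Sigma$, $\delta(q, a) = \mathrm{B}(h(D(a, q)))$. Then, $\mathrm{Atom}(\mathrm{B}(h(D(a, q))))
= \mathrm{Atom}(\delta(q, a))$.

(b) If $w = ua$ with $a \in \Sigma$ and $u \in \Sigma^+$, by the recurrence hypothesis,
$\mathrm{Atom}(\delta(q, u)) = \mathrm{Atom}(\mathrm{B}(h(D(u, q))))$.

$$\begin{aligned}
&\mathrm{Atom}(\delta(q, ua))\\
&= \mathrm{Atom}(\delta(\delta(q, u), a)) && \text{(Definition of } \delta)\\
&= \mathrm{Atom}(\delta(f_{\mathbb{B}}(q'_1, \ldots, q'_j), a)) && \text{(Definition of } \delta(q, u) \text{ with } \mathrm{Atom}(\delta(q, u)) = \{q'_1, \ldots, q'_j\})\\
&= \mathrm{Atom}(f_{\mathbb{B}}(\delta(q'_1, a), \ldots, \delta(q'_j, a))) && \text{(Definition of } \delta)\\
&= \textstyle\bigcup_{q' \in \{q'_1, \ldots, q'_j\}} \mathrm{Atom}(\delta(q', a)) && \text{(Definition of Atom)}\\
&= \textstyle\bigcup_{q' \in \mathrm{Atom}(\delta(q, u))} \mathrm{Atom}(\delta(q', a)) && \text{(Definition of } \{q'_1, \ldots, q'_j\})\\
&= \textstyle\bigcup_{q' \in \mathrm{Atom}(\mathrm{B}(h(D(u, q))))} \mathrm{Atom}(\mathrm{B}(h(D(a, q')))) && \text{(Induction hypothesis and construction of } \delta)\\
&= \mathrm{Atom}(\mathrm{B}(h(D(a, h(D(u, q)))))) && \text{(Atom-derivability property)}
\end{aligned}$$

2. As a direct consequence of the previous point, since the set $\{D(w, E) \mid w \in
\Sigma^+\}$ is finite, so are the sets $\{\mathrm{B}(h(D(w, E))) \mid w \in \Sigma^+\}$ and
$\bigcup_{w \in \Sigma^+} \mathrm{Atom}(\mathrm{B}(h(D(w, q))))$. Finally, the set $Q$, that is equal to
$\bigcup_{w \in \Sigma^+} \mathrm{Atom}(\delta(q, w))$, is a finite set.

□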

Example 1. Let $D_A$ and $D'_B$ be the derivations of Definition 3 and Definition 5.
Let $\mathrm{B}_A$ be the base inductively defined for any expression by $\mathrm{B}_A(E + F) =
\mathrm{B}_A(E) \vee_{\mathbb{B}} \mathrm{B}_A(F)$, $\mathrm{B}_A(E) = E$ otherwise. Let $\mathrm{B}_B$ be the base defined for any
expression by $\mathrm{B}_B(E) = E$. It can be shown that any couple in $\{D_A, D'_B\} \times
\{\mathrm{B}_A, \mathrm{B}_B\}$ satisfies the atom-derivability property.

Furthermore, the $(D_A, \mathrm{B}_A)$-AFA of $E$ can be straightforwardly transformed
into the derived term NFA of $E$ defined by Antimirov in [1]; the $(D_A, \mathrm{B}_B)$-AFA
of $E$ can be transformed into the partial derivative DFA of $E$ defined by
Antimirov in [1]; the $(D'_B, \mathrm{B}_B)$-AFA of $E$ can be transformed into the dissimilar
derivative DFA of $E$ defined by Brzozowski in [4]. Notice that the $(D'_B, \mathrm{B}_A)$-AFA
of $E$ can be transformed into an NFA that is different from the NFA of
Antimirov.

Finally, the base $\mathrm{B}_C$ inductively defined by $\mathrm{B}_C(f_e(E_1, \ldots, E_k)) = f_{\mathbb{B}}(\mathrm{B}_C(E_1),
\ldots, \mathrm{B}_C(E_k))$ for any operator $f_e$, $\mathrm{B}_C(E) = E$ otherwise, provides an AFA
construction both from $D_A$ and $D'_B$.
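As an illustration of the first transformation mentioned in Example 1, the following Python sketch builds an NFA whose states are the terms reachable by partial derivation, in the spirit of Antimirov's derived term automaton. It is a hedged sketch rather than the construction of the example itself: it is restricted to simple expressions encoded as nested tuples, and the names `pderiv`, `derived_term_nfa` and `nfa_accepts` are hypothetical.

```python
# Sketch: NFA whose states are reached by partial derivation (assumed encoding).

def nullable(e):
    tag = e[0]
    if tag in ("one", "star"):
        return True
    if tag in ("zero", "sym"):
        return False
    if tag == "cat":
        return nullable(e[1]) and nullable(e[2])
    return nullable(e[1]) or nullable(e[2])  # tag == "or"

def cat_set(es, f):
    return frozenset(("cat", e, f) for e in es)

def pderiv(a, e):
    """Set of partial derivatives of a simple expression e w.r.t. a."""
    tag = e[0]
    if tag == "sym":
        return frozenset([("one",)]) if e[1] == a else frozenset()
    if tag in ("one", "zero"):
        return frozenset()
    if tag == "or":
        return pderiv(a, e[1]) | pderiv(a, e[2])
    if tag == "star":
        return cat_set(pderiv(a, e[1]), e)
    left = cat_set(pderiv(a, e[1]), e[2])  # tag == "cat"
    return (left | pderiv(a, e[2])) if nullable(e[1]) else left

def derived_term_nfa(e, alphabet):
    """States: e and every term reachable by partial derivation."""
    states, todo, trans = {e}, [e], {}
    while todo:
        q = todo.pop()
        for a in alphabet:
            targets = pderiv(a, q)
            trans[(q, a)] = targets
            for t in targets - states:
                states.add(t)
                todo.append(t)
    finals = {q for q in states if nullable(q)}
    return states, trans, finals

def nfa_accepts(word, start, trans, finals):
    current = {start}
    for a in word:
        current = {t for q in current for t in trans.get((q, a), ())}
    return bool(current & finals)

A, B = ("sym", "a"), ("sym", "b")
E = ("cat", ("star", ("or", A, B)), ("cat", A, B))  # (a+b)*ab
states, trans, finals = derived_term_nfa(E, "ab")
```

The worklist terminates here because the set of derived terms of a simple expression is finite; for another support, the same loop is only guaranteed to stop when the derivation is finite.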

5 Derivation via the Set of Clausal Forms 

In this section, we show that the set of clausal forms over the set of
regular expressions and equipped with the right operators is a derivation
support and that the associated $D_C$ derivation is finite. Furthermore we prove
that the atom-derivability property is satisfied whenever the $D_C$ derivation
is associated with a base function in the set $\{\mathrm{B}_A, \mathrm{B}_B, \mathrm{B}_C\}$. Finally we
illustrate these results by the construction of the $(D_C, \mathrm{B}_C)$-AFA of the expression
$E = ((ab)^*a)\ \mathrm{XOR}_e\ ((abab)^*a)$.

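Let us first recall some definitions about clausal forms and their operators.
A clausal form over a set $X$ is an element in
$\mathbb{C}(X) = 2^{2^{X \cup \overline{X}}}$ where
$\overline{X} = \{\overline{x} \mid x \in X\}$. Let $\oplus$ and $\otimes$ be
the two mappings from $\mathbb{C}(X) \times \mathbb{C}(X)$ to $\mathbb{C}(X)$
and $\ominus$ be the function from $\mathbb{C}(X)$ to $\mathbb{C}(X)$ defined
for any $C_1, C_2$ in $\mathbb{C}(X)$ by:

• $C_1 \oplus C_2 = C_1 \cup C_2$,

• $C_1 \otimes C_2 = \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \{\mathcal{C}_1 \cup \mathcal{C}_2\}$,

• $\ominus(C_1) = \bigotimes_{\mathcal{C} \in C_1} \bigoplus_{x \in \mathcal{C}} \{\{n(x)\}\}$,
where $n(x) = \overline{x}$ and $n(\overline{x}) = x$ for any $x$ in $X$.

It can be shown that for any element $x \in X \cup \overline{X}$, for any
clause $C_1, C_2, C_3$ in $\mathbb{C}(X)$, the following conditions are
satisfied:

• $C_1 \otimes (C_2 \oplus C_3) = (C_1 \otimes C_2) \oplus (C_1 \otimes C_3)$,

• $\ominus\ominus\{\{x\}\} = \{\{x\}\}$,

• $\ominus(C_1 \oplus C_2) = \ominus(C_1) \otimes \ominus(C_2)$,

• $\ominus(C_1 \otimes C_2) = \ominus(C_1) \oplus \ominus(C_2)$.

Furthermore, let us notice that there exist clauses $C_1, C_2, C_3$ such that
$\ominus\ominus C_1 \neq C_1$ or
$C_1 \oplus (C_2 \otimes C_3) \neq (C_1 \oplus C_2) \otimes (C_1 \oplus C_3)$.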
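The three operators on clausal forms can be sketched as follows. The encoding is an assumption made for illustration only: a literal is a pair `(name, positive)`, a clause is a `frozenset` of literals, a clausal form is a `frozenset` of clauses, and the helper names (`lit`, `cf`, `oplus`, `otimes`, `ominus`) are hypothetical. The complement is computed as a product, over the clauses, of the unions of the complemented literals.

```python
from itertools import product

def lit(name, positive=True):
    """A literal x (positive) or its barred copy (positive=False)."""
    return (name, positive)

def cf(*clauses):
    """Build a clausal form from iterables of literals."""
    return frozenset(frozenset(c) for c in clauses)

def oplus(c1, c2):
    """Union of the two sets of clauses."""
    return c1 | c2

def otimes(c1, c2):
    """All pairwise unions of a clause of c1 with a clause of c2."""
    return frozenset(x | y for x, y in product(c1, c2))

def ominus(c):
    """Product, over the clauses, of the clausal forms of complemented literals."""
    result = cf(())  # the clausal form containing only the empty clause
    for clause in c:
        negated = frozenset(frozenset([(name, not pos)]) for name, pos in clause)
        result = otimes(result, negated)
    return result

A, B, C = cf([lit("a")]), cf([lit("b")]), cf([lit("c")])
X = cf([lit("a"), lit("b")], [lit("c"), lit("d")])

# Distributivity of otimes over oplus and double complement on a single literal.
print(otimes(A, oplus(B, C)) == oplus(otimes(A, B), otimes(A, C)))
print(ominus(ominus(A)) == A)
# The two equalities that may fail, as noticed in the text.
print(ominus(ominus(X)) == X)
print(oplus(A, otimes(B, C)) == otimes(oplus(A, B), oplus(A, C)))
```

On these sample values the first two comparisons hold and the last two do not: the comparison is between sets of clauses, not between the boolean functions they denote.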

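From now on, we will consider the set $\mathbb{C} = \mathbb{C}(\mathrm{Exp}(\Sigma))$
of the clausal forms over the set of regular expressions. We now explain how a
clausal form over the set of regular expressions is transformed into a regular
expression. Let us consider the function $h_C$ defined from $\mathbb{C}$ to
$\mathrm{Exp}(\Sigma)$ for any element $C$ in $\mathbb{C}$ by:

\[
h_C(C) =
\begin{cases}
0 & \text{if } C = \emptyset, \\
\sum_{\mathcal{C} \in C}
\begin{cases}
\neg_e 0 & \text{if } \mathcal{C} = \emptyset, \\
{\bigwedge_e}_{E \in \mathcal{C}}\, c(E) & \text{otherwise}
\end{cases}
& \text{otherwise,}
\end{cases}
\qquad
\text{where } c(E) =
\begin{cases}
\neg_e(E') & \text{if } E = \overline{E'}, \\
E & \text{otherwise.}
\end{cases}
\]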

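Let us now give the definition of the support operators. For any $k$-ary
boolean function $f$, let $f_C$ be the operator from $\mathbb{C}^k$ to
$\mathbb{C}$ associated with $f$ defined by:

• $f_C(C_1, \ldots, C_k) = \bigoplus_{b = (b_1, \ldots, b_k) \in \mathbb{B}^k \mid f(b) = 1} \bigotimes_{1 \le j \le k} g(b_j, C_j)$,

• where $g(b_j, C_j) = C_j$ if $b_j = 1$, and $g(b_j, C_j) = \ominus(C_j)$ otherwise.

The operator $\cdot_C$ from $\mathbb{C} \times \mathrm{Exp}(\Sigma)$ to
$\mathbb{C}$ is defined, for any clause $C$ in $\mathbb{C}$ and for any
expression $F$ in $\mathrm{Exp}(\Sigma)$, by:

\[ C \cdot_C F = \bigcup_{\mathcal{C} \in C} \{\{h_C(\{\mathcal{C}\}) \cdot F\}\}. \]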
Finally, we consider the operation set $O_C$ defined by

\[ O_C = \{\oplus, \cdot_C\} \cup \{f_C \mid f \text{ is a } k\text{-ary boolean function different from } \vee\}. \]

Let us notice that, by definition, the operator $\ominus$ (resp. $\otimes$) is
equal to the operator $\neg_C$ (resp. $\wedge_C$) whereas the operator $\oplus$
is different from $\vee_C$, since:

\[ C_1 \vee_C C_2 = ((\ominus(C_1) \otimes C_2) \oplus (C_1 \otimes \ominus(C_2))) \oplus (C_1 \otimes C_2). \]

There exist several expressions (combinations of $\oplus$, $\otimes$ and
$\ominus$) for a given $f_C$ operator. As an example, with the ternary
$\vee^3(b_1, b_2, b_3) = b_1 \vee b_2 \vee b_3$ is associated an operator
$\vee^3_C$ that can be expressed as the combination of two $\oplus$ operators:

\[ \vee^3_C(C_1, C_2, C_3) = (C_1 \oplus C_2) \oplus C_3. \]

Reduced expressions can be found using Karnaugh maps for instance.
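The construction of an operator on clausal forms from a boolean function can be sketched directly from the definition: one product per satisfying row of the truth table, then the union of these products. The sketch below makes the same encoding assumptions as any set-based representation (a literal is a pair `(name, positive)`, clauses and clausal forms are `frozenset`s); the names `lift`, `otimes` and `ominus` are hypothetical.

```python
from itertools import product

UNIT = frozenset([frozenset()])  # clausal form containing only the empty clause

def otimes(c1, c2):
    return frozenset(x | y for x, y in product(c1, c2))

def ominus(c):
    result = UNIT
    for clause in c:
        negated = frozenset(frozenset([(name, not pos)]) for name, pos in clause)
        result = otimes(result, negated)
    return result

def lift(f, k):
    """Operator on clausal forms associated with the k-ary boolean function f."""
    def f_c(*forms):
        if len(forms) != k:
            raise ValueError("expected %d clausal forms" % k)
        result = frozenset()
        for bits in product((False, True), repeat=k):
            if f(*bits):
                term = UNIT
                for bit, form in zip(bits, forms):
                    # g(b_j, C_j): the form itself for 1, its complement for 0
                    term = otimes(term, form if bit else ominus(form))
                result |= term
        return result
    return f_c

xor_c = lift(lambda x, y: x != y, 2)
or_c = lift(lambda x, y: x or y, 2)
and_c = lift(lambda x, y: x and y, 2)
not_c = lift(lambda x: not x, 1)

P = frozenset([frozenset([("p", True)]), frozenset([("1", True)])])
Q = frozenset([frozenset([("q", True)]), frozenset([("1", True)])])
```

Here `and_c(P, Q)` coincides with `otimes(P, Q)` and `not_c(P)` with `ominus(P)`, while `or_c(P, Q)` differs from the plain union `P | Q`, which matches the remark above on $\oplus$ and $\vee_C$.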
We now consider the 6-tuple
$S_C = (\Sigma, \mathbb{C}, O_C, h_C, \{\{1\}\}, \emptyset)$ and show that it is
a support.

Proposition 6. The 6-tuple $S_C$ is a support.

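Proof. Properties of support are trivially checked for the clauses
$\{\{1\}\}$, $\emptyset$ and $\{\emptyset\}$. Let us consider that
$C, C_1, \ldots, C_k$ are elements in
$\mathbb{C} \setminus \{\emptyset, \{\emptyset\}\}$. According to the
definitions of $\oplus$ and $\otimes$:

\[
\begin{aligned}
L(h_C(C_1 \oplus C_2)) &= L(h_C(C_1 \cup C_2)) \\
&= L\Bigl(\sum_{\mathcal{C} \in C_1 \cup C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr) \\
&= L\Bigl(\sum_{\mathcal{C} \in C_1} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr) \cup L\Bigl(\sum_{\mathcal{C} \in C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr) \\
&= L(h_C(C_1)) \cup L(h_C(C_2)) \\
&= L(h_C(C_1) + h_C(C_2))
\end{aligned}
\]

\[
\begin{aligned}
L(h_C(C_1 \otimes C_2)) &= L\Bigl(h_C\Bigl(\bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \{\mathcal{C}_1 \cup \mathcal{C}_2\}\Bigr)\Bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} L(h_C(\{\mathcal{C}_1 \cup \mathcal{C}_2\})) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} L\Bigl({\bigwedge_e}_{E \in \mathcal{C}_1 \cup \mathcal{C}_2}\, c(E)\Bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} L\Bigl({\bigwedge_e}_{E \in \mathcal{C}_1}\, c(E)\Bigr) \cap L\Bigl({\bigwedge_e}_{E \in \mathcal{C}_2}\, c(E)\Bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} L(h_C(\{\mathcal{C}_1\})) \cap L(h_C(\{\mathcal{C}_2\})) \\
&= \bigcup_{\mathcal{C} \in C_1} L(h_C(\{\mathcal{C}\})) \cap \bigcup_{\mathcal{C} \in C_2} L(h_C(\{\mathcal{C}\})) \\
&= L(h_C(C_1) \wedge_e h_C(C_2))
\end{aligned}
\]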

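Moreover,

\[
\begin{aligned}
L(h_C(\ominus(\{\{c\}\}))) &= L(h_C(\{\{n(c)\}\})) \\
&= \begin{cases} L(h_C(\{\{\overline{E}\}\})) & \text{if } c = E, \\ L(h_C(\{\{E\}\})) & \text{if } c = \overline{E} \end{cases} \\
&= \begin{cases} L(\neg_e E) & \text{if } c = E, \\ L(E) & \text{if } c = \overline{E} \end{cases} \\
&= \begin{cases} \neg_L L(E) & \text{if } c = E, \\ \neg_L L(\neg_e E) & \text{if } c = \overline{E} \end{cases} \\
&= \begin{cases} \neg_L L(h_C(\{\{c\}\})) & \text{if } c = E, \\ \neg_L L(h_C(\{\{c\}\})) & \text{if } c = \overline{E} \end{cases} \\
&= \neg_L L(h_C(\{\{c\}\}))
\end{aligned}
\]

Consequently:

\[
\begin{aligned}
L(h_C(\ominus(C_1))) &= L\Bigl(h_C\Bigl(\bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \{\{n(c)\}\}\Bigr)\Bigr) \\
&= \bigcap_{\mathcal{C}_1 \in C_1} L\Bigl(h_C\Bigl(\bigoplus_{c \in \mathcal{C}_1} \{\{n(c)\}\}\Bigr)\Bigr) \\
&= \bigcap_{\mathcal{C}_1 \in C_1} \bigcup_{c \in \mathcal{C}_1} L(h_C(\{\{n(c)\}\})) \\
&= \bigcap_{\mathcal{C}_1 \in C_1} \bigcup_{c \in \mathcal{C}_1} L(h_C(\ominus(\{\{c\}\}))) \\
&= \bigcap_{\mathcal{C}_1 \in C_1} \bigcup_{c \in \mathcal{C}_1} \neg_L L(h_C(\{\{c\}\})) \\
&= \bigcap_{\mathcal{C}_1 \in C_1} \neg_L \Bigl(\bigcap_{c \in \mathcal{C}_1} L(h_C(\{\{c\}\}))\Bigr) \\
&= \neg_L \Bigl(\bigcup_{\mathcal{C}_1 \in C_1} \bigcap_{c \in \mathcal{C}_1} L(h_C(\{\{c\}\}))\Bigr) \\
&= \neg_L L(h_C(C_1)) \\
&= L(\neg_e(h_C(C_1)))
\end{aligned}
\]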
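Hence, according to the definition of $f_C$:

\[
\begin{aligned}
L(h_C(f_C(C_1, \ldots, C_k))) &= L\Bigl(h_C\Bigl(\bigoplus_{b = (b_1, \ldots, b_k) \mid f(b) = 1} \bigotimes_{1 \le j \le k} g(b_j, C_j)\Bigr)\Bigr) \\
&= \bigcup_{b = (b_1, \ldots, b_k) \mid f(b) = 1} \bigcap_{1 \le j \le k} L(h_C(g(b_j, C_j))) \\
&= \bigcup_{b = (b_1, \ldots, b_k) \mid f(b) = 1} \bigcap_{1 \le j \le k} \begin{cases} L(h_C(C_j)) & \text{if } b_j = 1, \\ L(h_C(\ominus C_j)) & \text{if } b_j = 0 \end{cases} \\
&= \bigcup_{b = (b_1, \ldots, b_k) \mid f(b) = 1} \bigcap_{1 \le j \le k} \begin{cases} L(h_C(C_j)) & \text{if } b_j = 1, \\ \neg_L L(h_C(C_j)) & \text{if } b_j = 0 \end{cases} \\
&= f_L(L(h_C(C_1)), \ldots, L(h_C(C_k))) \\
&= L(f_e(h_C(C_1), \ldots, h_C(C_k)))
\end{aligned}
\]

Furthermore,

\[
\begin{aligned}
L(h_C(C \cdot_C F)) &= L\Bigl(h_C\Bigl(\bigcup_{\mathcal{C} \in C} \Bigl\{\Bigl\{\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr) \cdot F\Bigr\}\Bigr\}\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C} L\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr) \cdot L(F) \\
&= \Bigl(\bigcup_{\mathcal{C} \in C} L\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cdot L(F) \\
&= L(h_C(C)) \cdot L(F) \\
&= L(h_C(C) \cdot F)
\end{aligned}
\]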

□ 

We now study the properties of the derivation $D_C$ associated with the support
$S_C$.

Theorem 3. Let $E$ be an expression in $\mathrm{Exp}(\Sigma)$. Let $(D, B)$ be a
couple in $\{D_C\} \times \{B_A, B_B, B_C\}$. Then:

The $(D, B)$-automaton of $E$ is an AFA recognizing $L(E)$.

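Proof. (I) Let us show that the derivation $D_C$ satisfies the sufficient
conditions for finiteness given by the finiteness proposition stated earlier.
(a) Since $\oplus = \cup$, the function $\oplus$ is associative, commutative and
idempotent (H1). (b) According to the definition of the operators $\oplus$,
$\otimes$ and $\ominus$:

\[
\begin{aligned}
D_C(a, h_C(C_1 \oplus C_2)) &= D_C(a, h_C(C_1 \cup C_2)) \\
&= D_C\Bigl(a, \sum_{\mathcal{C} \in C_1 \cup C_2} \begin{cases} \neg_e 0 & \text{if } \mathcal{C} = \emptyset, \\ {\bigwedge_e}_{E \in \mathcal{C}}\, c(E) & \text{otherwise} \end{cases}\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1 \cup C_2} \begin{cases} D_C(a, \neg_e 0) & \text{if } \mathcal{C} = \emptyset, \\ D_C(a, {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)) & \text{otherwise} \end{cases} \\
&= \bigcup_{\mathcal{C} \in C_1} \begin{cases} D_C(a, \neg_e 0) & \text{if } \mathcal{C} = \emptyset, \\ D_C(a, {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)) & \text{otherwise} \end{cases}
\ \cup \bigcup_{\mathcal{C} \in C_2} \begin{cases} D_C(a, \neg_e 0) & \text{if } \mathcal{C} = \emptyset, \\ D_C(a, {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)) & \text{otherwise} \end{cases} \\
&= D_C\Bigl(a, \sum_{\mathcal{C} \in C_1} \begin{cases} \neg_e 0 & \text{if } \mathcal{C} = \emptyset, \\ {\bigwedge_e}_{E \in \mathcal{C}}\, c(E) & \text{otherwise} \end{cases}\Bigr)
\cup D_C\Bigl(a, \sum_{\mathcal{C} \in C_2} \begin{cases} \neg_e 0 & \text{if } \mathcal{C} = \emptyset, \\ {\bigwedge_e}_{E \in \mathcal{C}}\, c(E) & \text{otherwise} \end{cases}\Bigr) \\
&= D_C(a, h_C(C_1)) \oplus D_C(a, h_C(C_2)).
\end{aligned}
\]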
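\[
\begin{aligned}
D_C(a, h_C(C_1 \otimes C_2)) &= D_C\Bigl(a, h_C\Bigl(\bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \{\mathcal{C}_1 \cup \mathcal{C}_2\}\Bigr)\Bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} D_C(a, h_C(\{\mathcal{C}_1 \cup \mathcal{C}_2\})) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} D_C(a, h_C(\{\mathcal{C}_1\} \otimes \{\mathcal{C}_2\})) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} D_C(a, h_C(\{\mathcal{C}_1\})) \otimes D_C(a, h_C(\{\mathcal{C}_2\})) \\
&= D_C(a, h_C(C_1)) \otimes D_C(a, h_C(C_2)).
\end{aligned}
\]

\[
\begin{aligned}
D_C(a, h_C(\ominus(C_1))) &= D_C\Bigl(a, h_C\Bigl(\bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \{\{n(c)\}\}\Bigr)\Bigr) \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} D_C\Bigl(a, h_C\Bigl(\bigoplus_{c \in \mathcal{C}_1} \{\{n(c)\}\}\Bigr)\Bigr) \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} D_C(a, h_C(\{\{n(c)\}\})) \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \begin{cases} D_C(a, h_C(\{\{c'\}\})) & \text{if } c = \overline{c'}, \\ D_C(a, h_C(\{\{\overline{c}\}\})) & \text{otherwise} \end{cases} \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \begin{cases} D_C(a, c') & \text{if } c = \overline{c'}, \\ D_C(a, \neg_e c) & \text{otherwise} \end{cases} \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \begin{cases} \ominus\ominus D_C(a, c') & \text{if } c = \overline{c'}, \\ \ominus D_C(a, c) & \text{otherwise} \end{cases} \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \begin{cases} \ominus D_C(a, \neg_e c') & \text{if } c = \overline{c'}, \\ \ominus D_C(a, c) & \text{otherwise} \end{cases} \\
&= \bigotimes_{\mathcal{C}_1 \in C_1} \bigoplus_{c \in \mathcal{C}_1} \ominus D_C(a, h_C(\{\{c\}\})) \\
&= \ominus \bigoplus_{\mathcal{C}_1 \in C_1} \bigotimes_{c \in \mathcal{C}_1} D_C(a, h_C(\{\{c\}\})) \\
&= \ominus D_C\Bigl(a, \sum_{\mathcal{C}_1 \in C_1} {\bigwedge_e}_{c \in \mathcal{C}_1}\, h_C(\{\{c\}\})\Bigr) \\
&= \ominus(D_C(a, h_C(C_1))).
\end{aligned}
\]

Finally, since any $f_C$ is defined as a combination of $\oplus$, $\otimes$ and
$\ominus$ operators, hypothesis H2 holds. According to the finiteness
proposition, $D_C$ is finite.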

(II) Let us show that any couple in $\{D_C\} \times \{B_A, B_B, B_C\}$
satisfies the atom-derivability property. (a) By definition of $B_B$, the
atom-derivability property is satisfied by $(D_C, B_B)$. (b) By induction over
the structure of $E$. Let $a$ be a symbol in $\Sigma$. (i) If
$E \in \Sigma \cup \{1, 0\}$ or if $E = F \cdot G$ or if $E = F^*$, since
$B_A(E) = B_C(E) = E$, $\mathrm{Atom}(B_A(E)) = \mathrm{Atom}(B_C(E)) = \{E\}$.
(ii) Let $f$ be a $k$-ary boolean function. Let us first prove that the equation
of the atom-derivability property is satisfied for the operators $\oplus$,
$\otimes$ and $\ominus$. If $C_1$ or $C_2$ equals $\emptyset$ or
$\{\emptyset\}$, the equation is trivially satisfied. Let $C_1$ and $C_2$ be two
clauses different from $\emptyset$ and $\{\emptyset\}$.

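\[
\begin{aligned}
\mathrm{Atom}(B_C(h_C(C_1 \cup C_2))) &= \mathrm{Atom}\Bigl(B_C\Bigl(\sum_{\mathcal{C} \in C_1 \cup C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_1 \cup C_2} B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1 \cup C_2} \mathrm{Atom}\Bigl(B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1} \mathrm{Atom}\Bigl(B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \bigcup_{\mathcal{C} \in C_2} \mathrm{Atom}\Bigl(B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_1} B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_2} B_C\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl(B_C\Bigl(\sum_{\mathcal{C} \in C_1} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \mathrm{Atom}\Bigl(B_C\Bigl(\sum_{\mathcal{C} \in C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}(B_C(h_C(C_1))) \cup \mathrm{Atom}(B_C(h_C(C_2)))
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{Atom}(B_C(h_C(C_1 \otimes C_2))) &= \mathrm{Atom}\Bigl(B_C\Bigl(h_C\Bigl(\bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \{\mathcal{C}_1 \cup \mathcal{C}_2\}\Bigr)\Bigr)\Bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_1 \cup \mathcal{C}_2\}))) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \bigl(\mathrm{Atom}(B_C(h_C(\{\mathcal{C}_1\}))) \cup \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_2\})))\bigr) \\
&= \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_1\}))) \cup \bigcup_{(\mathcal{C}_1, \mathcal{C}_2) \in C_1 \times C_2} \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_2\}))) \\
&= \bigcup_{\mathcal{C}_1 \in C_1} \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_1\}))) \cup \bigcup_{\mathcal{C}_2 \in C_2} \mathrm{Atom}(B_C(h_C(\{\mathcal{C}_2\}))) \\
&= \mathrm{Atom}(B_C(h_C(C_1))) \cup \mathrm{Atom}(B_C(h_C(C_2)))
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{Atom}(B_C(h_C(\ominus C_1))) &= \mathrm{Atom}\Bigl(B_C\Bigl(h_C\Bigl(\bigotimes_{\mathcal{C} \in C_1} \bigoplus_{x \in \mathcal{C}} \{\{n(x)\}\}\Bigr)\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1} \bigcup_{x \in \mathcal{C}} \mathrm{Atom}(B_C(h_C(\{\{n(x)\}\}))) \\
&= \bigcup_{\mathcal{C} \in C_1} \bigcup_{x \in \mathcal{C}} \mathrm{Atom}(B_C(h_C(\{\{x\}\}))) \\
&= \mathrm{Atom}\Bigl(B_C\Bigl(h_C\Bigl(\bigcup_{\mathcal{C} \in C_1} \bigcup_{x \in \mathcal{C}} \{\{x\}\}\Bigr)\Bigr)\Bigr) \\
&= \mathrm{Atom}(B_C(h_C(C_1)))
\end{aligned}
\]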
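Hence, since any operator $f_C$ is a composition of $\oplus$, $\otimes$ and
$\ominus$:

\[ \mathrm{Atom}(B_C(h_C(f_C(C_1, \ldots, C_k)))) = \bigcup_{1 \le j \le k} \mathrm{Atom}(B_C(h_C(C_j))). \]

Consequently:

\[
\begin{aligned}
\mathrm{Atom}(B_C(h_C(D_C(a, f_e(E_1, \ldots, E_k))))) &= \mathrm{Atom}(B_C(h_C(f_C(D_C(a, E_1), \ldots, D_C(a, E_k))))) \\
&= \bigcup_{1 \le j \le k} \mathrm{Atom}(B_C(h_C(D_C(a, E_j)))) \\
&= \bigcup_{1 \le j \le k} \bigcup_{E' \in \mathrm{Atom}(B_C(E_j))} \mathrm{Atom}(B_C(h_C(D_C(a, E')))) \\
&= \bigcup_{E' \in \mathrm{Atom}(B_C(E))} \mathrm{Atom}(B_C(h_C(D_C(a, E'))))
\end{aligned}
\]

Furthermore, if $f \neq \vee$, $B_A(f_e(E_1, \ldots, E_k)) = f_e(E_1, \ldots, E_k)$ and

\[ \mathrm{Atom}(B_A(f_e(E_1, \ldots, E_k))) = \{f_e(E_1, \ldots, E_k)\}. \]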

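Finally,

\[
\begin{aligned}
\mathrm{Atom}(B_A(h_C(C_1 \cup C_2))) &= \mathrm{Atom}\Bigl(B_A\Bigl(\sum_{\mathcal{C} \in C_1 \cup C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_1 \cup C_2} B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1 \cup C_2} \mathrm{Atom}\Bigl(B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \bigcup_{\mathcal{C} \in C_1} \mathrm{Atom}\Bigl(B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \bigcup_{\mathcal{C} \in C_2} \mathrm{Atom}\Bigl(B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_1} B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \mathrm{Atom}\Bigl({\bigvee_B}_{\mathcal{C} \in C_2} B_A\Bigl({\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}\Bigl(B_A\Bigl(\sum_{\mathcal{C} \in C_1} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \cup \mathrm{Atom}\Bigl(B_A\Bigl(\sum_{\mathcal{C} \in C_2} {\bigwedge_e}_{E \in \mathcal{C}}\, c(E)\Bigr)\Bigr) \\
&= \mathrm{Atom}(B_A(h_C(C_1))) \cup \mathrm{Atom}(B_A(h_C(C_2)))
\end{aligned}
\]

Consequently, any couple in $\{D_C\} \times \{B_A, B_B, B_C\}$ satisfies the
atom-derivability property.

(III) According to Theorem 2, from (I) and (II), the theorem holds. □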

The following example illustrates the computation of an AFA from a regular
expression. In order to improve readability, regular expressions are simplified
according to the following rules:

\[
\begin{aligned}
E + 0 &= 0 + E = E \\
E \cdot 0 &= 0 \cdot E = 0 \\
E \cdot 1 &= 1 \cdot E = E
\end{aligned}
\]

Example 2. Let $E = ((ab)^*a)\ \mathrm{XOR}_e\ ((abab)^*a)$. We now construct
the $(D_C, B_C)$-automaton of $E$, that is the AFA $(\Sigma, Q, E, F, \delta)$.
We first compute the derivatives $C$ of $E$ w.r.t. any symbol in $\Sigma$,
according to the derivation $D_C$, and then compute the derivatives of the
expressions that are atoms of the base of $h_C(C)$. This scheme is repeated
until no more expression is produced.
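\[
\begin{aligned}
D_C(a, E) &= (\ominus(D_C(a, (ab)^*a)) \otimes D_C(a, (abab)^*a)) \oplus (D_C(a, (ab)^*a) \otimes \ominus(D_C(a, (abab)^*a))) \\
&= (\ominus(\{\{b(ab)^*a\}, \{1\}\}) \otimes \{\{bab(abab)^*a\}, \{1\}\}) \oplus (\{\{b(ab)^*a\}, \{1\}\} \otimes \ominus(\{\{bab(abab)^*a\}, \{1\}\})) \\
&= (\{\{\overline{b(ab)^*a}, \overline{1}\}\} \otimes \{\{bab(abab)^*a\}, \{1\}\}) \oplus (\{\{b(ab)^*a\}, \{1\}\} \otimes \{\{\overline{bab(abab)^*a}, \overline{1}\}\}) \\
&= \{\{\overline{b(ab)^*a}, \overline{1}, bab(abab)^*a\}, \{\overline{b(ab)^*a}, \overline{1}, 1\}, \{b(ab)^*a, \overline{bab(abab)^*a}, \overline{1}\}, \{1, \overline{bab(abab)^*a}, \overline{1}\}\}
\end{aligned}
\]

\[
\begin{aligned}
D_C(b, E) &= \emptyset & D_C(a, b(ab)^*a) &= \emptyset \\
D_C(b, b(ab)^*a) &= \{\{(ab)^*a\}\} & D_C(a, (ab)^*a) &= \{\{b(ab)^*a\}, \{1\}\} \\
D_C(b, (ab)^*a) &= \emptyset & D_C(a, 1) &= \emptyset \\
D_C(b, 1) &= \emptyset & D_C(a, bab(abab)^*a) &= \emptyset \\
D_C(b, bab(abab)^*a) &= \{\{ab(abab)^*a\}\} & D_C(a, ab(abab)^*a) &= \{\{b(abab)^*a\}\} \\
D_C(b, ab(abab)^*a) &= \emptyset & D_C(a, b(abab)^*a) &= \emptyset \\
D_C(b, b(abab)^*a) &= \{\{(abab)^*a\}\} & D_C(a, (abab)^*a) &= \{\{bab(abab)^*a\}, \{1\}\} \\
D_C(b, (abab)^*a) &= \emptyset
\end{aligned}
\]

From the computation of the derivatives, we deduce:

• the set $Q = \{q_1, \ldots, q_8\}$ of states:

$q_1 = E$, $q_2 = b(ab)^*a$, $q_3 = 1$, $q_4 = bab(abab)^*a$,
$q_5 = ab(abab)^*a$, $q_6 = b(abab)^*a$, $q_7 = (ab)^*a$, $q_8 = (abab)^*a$,

• the function $F$ from $Q$ to $\mathbb{B}$:

$F(q_3) = F(q_7) = F(q_8) = 1$,
$F(q_1) = F(q_2) = F(q_4) = F(q_5) = F(q_6) = 0$,

• and the function $\delta$ from $Q \times \Sigma$ to $\mathrm{BoolForm}(Q)$:

\[
\begin{aligned}
\delta(q_1, a) &= (\neg_B q_2 \wedge_B \neg_B q_3 \wedge_B q_4) \vee_B (\neg_B q_2 \wedge_B \neg_B q_3 \wedge_B q_3) \\
&\quad \vee_B (q_2 \wedge_B \neg_B q_4 \wedge_B \neg_B q_3) \vee_B (q_3 \wedge_B \neg_B q_4 \wedge_B \neg_B q_3), \\
\delta(q_5, a) &= q_6, \qquad \delta(q_7, a) = q_2 \vee_B q_3, \qquad \delta(q_8, a) = q_4 \vee_B q_3, \\
\delta(q_2, b) &= q_7, \qquad \delta(q_4, b) = q_5, \qquad \delta(q_6, b) = q_8.
\end{aligned}
\]

For every other pair $(q, x)$ in $Q \times \Sigma$, $D_C(x, q) = \emptyset$ and
the corresponding entry of the table is empty.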
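The derivatives of this example can be recomputed mechanically. The sketch below is not the paper's definition of $D_C$, which is given in an earlier section: it assumes the usual rules for symbols, concatenation and star, applies the operator associated with a boolean function to the derivatives of its operands, and uses the simplification rules for $0$ and $1$. The tuple encoding of expressions, the pair encoding `(expression, positive)` of literals and all function names are hypothetical.

```python
from itertools import product

ZERO, ONE = ("0",), ("1",)
EMPTY, UNIT = frozenset(), frozenset([frozenset()])  # no clause / only the empty clause
BOOL = {"xor": lambda x, y: x != y,
        "and": lambda *xs: all(xs),
        "not": lambda x: not x}

def sym(a):
    return ("sym", a)

def star(e):
    return ("star", e)

def cat(e, f):
    """Flattened concatenation with E.0 = 0.E = 0 and E.1 = 1.E = E."""
    if ZERO in (e, f):
        return ZERO
    if e == ONE:
        return f
    if f == ONE:
        return e
    parts = lambda x: x[1] if x[0] == "cat" else (x,)
    return ("cat", parts(e) + parts(f))

def word(w):
    e = ONE
    for a in w:
        e = cat(e, sym(a))
    return e

def nullable(e):
    tag = e[0]
    if tag in ("1", "star"):
        return True
    if tag in ("0", "sym"):
        return False
    if tag == "cat":
        return all(nullable(x) for x in e[1])
    return BOOL[e[1]](*[nullable(x) for x in e[2]])

def otimes(c1, c2):
    return frozenset(x | y for x, y in product(c1, c2))

def ominus(c):
    out = UNIT
    for clause in c:
        out = otimes(out, frozenset(frozenset([(e, not pos)]) for e, pos in clause))
    return out

def clause_expr(clause):
    """Expression of a single clause: conjunction of its literals."""
    lits = [e if pos else ("op", "not", (e,)) for e, pos in sorted(clause, key=repr)]
    if not lits:
        return ("op", "not", (ZERO,))
    return lits[0] if len(lits) == 1 else ("op", "and", tuple(lits))

def dot(c, f):
    """Concatenate f to the expression of every clause of c."""
    return frozenset(frozenset([(cat(clause_expr(cl), f), True)]) for cl in c)

def deriv(a, e):
    tag = e[0]
    if tag in ("0", "1"):
        return EMPTY
    if tag == "sym":
        return frozenset([frozenset([(ONE, True)])]) if e[1] == a else EMPTY
    if tag == "star":
        return dot(deriv(a, e[1]), e)
    if tag == "cat":
        head, tail = e[1][0], e[1][1:]
        rest = tail[0] if len(tail) == 1 else ("cat", tail)
        out = dot(deriv(a, head), rest)
        return out | deriv(a, rest) if nullable(head) else out
    # Boolean operator: one product per satisfying row of the truth table.
    f, args = BOOL[e[1]], [deriv(a, x) for x in e[2]]
    out = EMPTY
    for bits in product((False, True), repeat=len(args)):
        if f(*bits):
            term = UNIT
            for bit, c in zip(bits, args):
                term = otimes(term, c if bit else ominus(c))
            out |= term
    return out

def states(e, alphabet):
    """Close {e} under the expressions occurring as literals of the derivatives."""
    seen, todo = {e}, [e]
    while todo:
        q = todo.pop()
        for a in alphabet:
            for clause in deriv(a, q):
                for atom, _ in clause:
                    if atom not in seen:
                        seen.add(atom)
                        todo.append(atom)
    return seen

E1 = cat(star(word("ab")), sym("a"))     # (ab)*a
E2 = cat(star(word("abab")), sym("a"))   # (abab)*a
E = ("op", "xor", (E1, E2))
```

Under these assumptions, `deriv("a", E)` yields the four clauses listed above and `states(E, "ab")` contains eight expressions, which correspond to the states $q_1, \ldots, q_8$. The closure collects literals directly, which agrees with the atoms of the base only because no literal of this example has a boolean operator at its root.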









Let us notice that in this example, substituting $B_C(E)$ for $E$ in $I$ would
produce a smaller automaton.

6 Conclusion

This paper provides two main results. First, the theoretical scheme of
derivations via a support allows us to formalize intrinsic properties of
(unrestricted) regular expression derivations. As a by-product we obtain a kind
of unification of the classical derivations that compute word derivatives,
partial derivatives or extended partial derivatives. Secondly, the notion of
base function that associates a boolean formula with a regular expression allows
us to show how to deduce an alternating automaton equivalent to a given regular
expression from the set of its derivatives via a given support. We are now
investigating new features: it is possible, for example, to define morphisms
from one support to another one in order to study the relations between the
associated automata. Another perspective is to replace the derivation mapping by
another mapping (right derivation or left-and-right derivation for example, or
any transformation with good properties). There also exist well-known
algorithms to reduce boolean formulas (Karnaugh, Quine-McCluskey); we intend to
investigate reduction techniques based on derivation. Finally we intend to
extend the theoretical derivation scheme in order to handle the derivation of
regular expressions with multiplicities.